STABILITY OF JOIN THE SHORTEST QUEUE
NETWORKS 

By Maury Bramson* 

University of Minnesota, Twin Cities 

Join the shortest queue (JSQ) refers to networks whose incoming 
jobs are assigned to the shortest queue from among a randomly cho- 
sen subset of the queues in the system. After completion of service 
at the queue, a job leaves the network. We show that, for all non- 
idling service disciplines and for general interarrival and service time 
distributions, such networks are stable when they are subcritical. We 
then obtain uniform bounds on the tails of the marginal distribu- 
tions of the equilibria for families of such networks; these bounds 
are employed to show relative compactness of the marginal distribu- 
tions. We also present a family of subcritical JSQ networks whose 
workloads in equilibrium are much larger than for the corresponding 
networks where each incoming job is assigned randomly to a queue. 
Part of this work generalizes results in Foss and Chernova [12], which
applied fluid limits to study networks with the FIFO discipline. Here, 
we apply an appropriate Lyapunov function. 



1. Introduction. Join the shortest queue (JSQ) refers to networks 
whose incoming "jobs" (or "customers") are assigned to the shortest queue 
from among a randomly chosen subset of queues in the system. Here, short- 
est queue means the queue with the fewest jobs. (One could also consider 
the queue with the least remaining work.) Jobs are assumed to arrive in the 
network through one or more random streams of jobs. An arriving job is 
presented with a random subset B of the queues, with probability depend- 
ing only on the stream, and chooses the queue in B with the fewest jobs; 
when two or more queues have the fewest jobs, one of these queues is chosen 
according to some rule. Jobs at queues are served according to some disci- 
pline, such as first-in, first-out (FIFO), last-in, first-out (LIFO), or processor 
sharing (PS) and, upon completion of service, leave the network. 

A family of such networks is given by the mean field rule where $|B| = D$,
$D \ge 1$, is fixed, and $B$ is chosen uniformly from among the $\binom{N}{D}$
such sets, where $N$ is the total number of queues. An alternative rule is given
by choosing $D$ queues with replacement from among the $N$ queues, as in
Vvedenskaya, Dobrushin and Karpelevich [22]. A natural setting for both
rules is where all jobs arrive at a single Poisson stream with rate $\alpha N$,
with all jobs having the same service distribution $F(\cdot)$, with mean $m$,
and are served according to the same discipline at each queue. Such networks
are subcritical when $\alpha m < 1$, that is, the long term rate at which jobs
arrive in the network is strictly less than the total rate at which jobs are
served when no queues are empty. When $\alpha m < 1$ and $F(\cdot)$ is
exponentially distributed, it is elementary to show that the network is stable,
that is, the underlying Markov process is positive Harris recurrent. It will
therefore have an equilibrium distribution. In this setting, the choice of
service discipline (when not relying on the residual service times) does not
affect the distribution of the number of jobs at each queue, and the state space
can be chosen so that it depends only on the number of jobs at each queue. When
$D = 1$, we will say that each incoming job is assigned randomly to a queue;
when, in addition, $F(\cdot)$ is exponentially distributed, the network will
consist of $N$ independent M/M/1 queues.
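
As a minimal sketch of the mean field rule in the exponential setting just described, the following Python code draws the selection set B uniformly from among the D-element subsets and simulates the number of jobs at each queue. The function names, the lowest-index tie-breaking rule and the parameter values in the usage note are illustrative assumptions; they are not taken from the paper.

```python
import random


def select_queue(queue_lengths, D, rng):
    """Mean field JSQ selection: draw D distinct queues uniformly and
    return the index of the one with the fewest jobs."""
    N = len(queue_lengths)
    B = rng.sample(range(N), D)  # selection set B with |B| = D
    # Tie-breaking (an assumption): lowest index among the shortest queues in B.
    return min(sorted(B), key=lambda n: queue_lengths[n])


def simulate_jsq(N, D, alpha, m, horizon, seed=0):
    """Simulate N queues with exponential service times of mean m and a single
    Poisson arrival stream of rate alpha * N; return the time-averaged number
    of jobs per queue over [0, horizon]."""
    rng = random.Random(seed)
    z = [0] * N                  # number of jobs at each queue
    mu = 1.0 / m                 # service rate at a busy queue
    t, area = 0.0, 0.0
    while True:
        busy = [n for n in range(N) if z[n] > 0]
        total_rate = alpha * N + mu * len(busy)
        dt = rng.expovariate(total_rate)
        if t + dt >= horizon:
            area += sum(z) * (horizon - t)
            break
        area += sum(z) * dt
        t += dt
        if rng.random() < alpha * N / total_rate:
            z[select_queue(z, D, rng)] += 1   # arrival joins shortest queue in B
        else:
            z[rng.choice(busy)] -= 1          # departure from a busy queue
    return area / (horizon * N)
```

For example, `simulate_jsq(20, 2, 0.7, 1.0, 2000.0)` and `simulate_jsq(20, 1, 0.7, 1.0, 2000.0)` compare D = 2 with random assignment (D = 1) at the same subcritical load.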

The asymptotic behavior of the equilibria of such networks, with $D$ fixed
as $N \to \infty$, has been studied since the mid 1990s. Let $E^{(N)}(\cdot)$
denote the equilibrium distribution function at a single queue for the network
of $N$ queues. In [22], it was shown that, for $\rho = \alpha m < 1$,

(1.1)   $\lim_{N \to \infty} \bar{E}^{(N)}(\ell) = \rho^{1 + D + \cdots + D^{\ell - 1}} = \rho^{(D^{\ell} - 1)/(D - 1)}$   for $\ell \in \mathbb{Z}_+$,

where $\bar{E}^{(N)}(\cdot) = 1 - E^{(N)}(\cdot)$, and $D > 1$ is required for the
second equality. Hence, the tail of $\lim_{N \to \infty} E^{(N)}(\cdot)$ decreases
doubly exponentially fast when $D > 1$; when $D = 1$, the exponential tail is
that of the corresponding M/M/1 queue. This rapid decrease in the tail has
different applications,
such as in the design of complex networking systems where memory is at a 
premium. See Azar, Broder, Karlin and Upfal [1], Luczak and McDiarmid 
[13, 14], Martin and Suhov [15], Mitzenmacher [17], Suhov and Vvedenskaya 
[20], Vöcking [21], and Vvedenskaya and Suhov [23] for related work on JSQ
networks and ball-bin models in both theoretical and applied contexts. 
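
To illustrate the difference between the doubly exponential and exponential tails, the following short Python sketch evaluates the limiting tail, taking the exponent in (1.1) to be 1 + D + ... + D^(l-1). The function name and the sample values of rho and D are assumptions made for illustration.

```python
def limiting_tail(rho, D, ell):
    """Limiting tail rho ** (1 + D + ... + D**(ell - 1)), following (1.1).

    For D = 1 the exponent is ell, which gives the geometric M/M/1 tail.
    """
    if not 0 <= rho < 1 or D < 1 or ell < 0:
        raise ValueError("need 0 <= rho < 1, D >= 1 and ell >= 0")
    exponent = sum(D ** i for i in range(ell))  # equals (D**ell - 1)/(D - 1) when D > 1
    return rho ** exponent


if __name__ == "__main__":
    # Compare random assignment (D = 1) with the choice of two queues (D = 2).
    for ell in range(1, 6):
        print(ell, limiting_tail(0.9, 1, ell), limiting_tail(0.9, 2, ell))
```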

The study of networks with given N has been more restricted. Foley and 
McDonald [11] studied the equilibria for small values of N. 

Little work has been done on networks with nonexponential service dis- 
tributions. In this setting, the stability of subcritical networks is no longer 
obvious. In particular, jobs might be assigned to short queues where the re- 
maining work (or workload) is high, which can cause service inactivity after 
queues with many jobs, but low remaining work, empty. If the system can 
be "tricked" too often in this manner, it is conceivable that it is unstable 
while nevertheless being subcritical. For general service distributions, the
evolution of the system will be influenced by the service discipline, which
complicates analysis.

For JSQ networks with general service times, Foss and Chernova [12] is the 
main work that analyzes stability. Under the FIFO discipline, stability for a 
broad family of subcritical networks is demonstrated in [12], including those 
with the JSQ rule, for general service distributions and for arrivals given 
by a single renewal stream. Fluid limits are employed as the main tool. In 
this more general framework, the appropriate definition for subcritical is no 
longer transparent. It will be discussed in the next subsection. 

For our results, we adopt the same basic framework as in [12] for JSQ 
networks, but instead consider general service disciplines. We also allow 
multiple arrival streams. For general service disciplines, the number of par- 
tially served jobs may be large and known fluid limit techniques cannot be 
applied. Instead, we employ an appropriate Lyapunov function. 

In this paper, we first show stability of subcritical networks for all non- 
idling disciplines. We then obtain uniform bounds on the tails of the marginal 
distributions of the equilibria for families of such networks; these bounds are 
employed to show relative compactness of the marginal distributions. Both 
the uniform bounds and relative compactness will be important tools for 
investigating, in the mean field setting, the limiting behavior, as $N \to \infty$,
of the equilibrium distributions $E^{(N)}(\cdot)$ at single queues for service
disciplines such as FIFO, LIFO and PS (see Bramson, Lu and Prabhakar [6, 7]).
Appropriate analogs of (1.1), for large values of $\ell$, will hold under certain
restrictions. We lastly present a family of subcritical mean field networks, with
$N = D = 2$, where the service discipline is chosen so that the corresponding
equilibria have much larger workload than do the corresponding M/G/1
queues. This shows that the JSQ rule does not always provide efficient service
of jobs.

Main results. We present our main results here, Theorems 1.1, 1.2 and 
1.3, and discuss their ramifications. In the next subsection, we will give a 
general outline of the paper. 

In order to avoid technical details, we postpone until Section 2 details
regarding the construction of the state space $S$ and of the Markov process
$X(t)$, $t \ge 0$, underlying a JSQ network. We require at this point only
limited specifics about the construction, namely that a state $x \in S$ is
specified by descriptors that include the number of jobs $z_n$ at each queue $n$,
$n = 1, \ldots, N$; the residual interarrival times $u_k$, $k = 1, \ldots, K$, at
each of the arrival streams, which are given by independent renewal processes; the
residual service times $v_{n,i}$, $n = 1, \ldots, N$ and $i = 1, \ldots, z_n$, for
each of the jobs currently in the network; and the ages $o_{n,i}$ for each of these
jobs. (In the construction of $S$ in Section 2, normalized versions of $u_k$,
$v_{n,i}$ and $o_{n,i}$ are used.) When the arrival streams are Poisson, or when
the service time distributions are the same and exponentially distributed, the
corresponding descriptor may be dropped. The underlying Markov process
$X(\cdot)$ takes values in $S$ and is strong Markov.

In order to state our results, we need to specify the notion of subcriticality
that was mentioned in the context of [12]. This requires the introduction of
various terminology. We denote by $G_k(\cdot)$, $k = 1, \ldots, K$, the
distribution function for the interarrival time of jobs at the $k$th renewal
stream, and by $\alpha_k$ the reciprocal of its mean, with $\alpha_k > 0$ being
assumed. We denote by $p_{k,B}$ the probability that a job from arrival stream $k$
chooses the shortest queue from the set $B$, $B \subseteq B_N$, where
$B_N = \{1, \ldots, N\}$. We refer to $B$ as the selection set and the rule
corresponding to a given choice of $p_{k,B}$, $k = 1, \ldots, K$,
$B \subseteq B_N$, as the selection rule. For $n \in B$, such a job is a
potential arrival at $n$.

We denote by $F_j(\cdot)$, with $j = (k, B, n)$ for $k = 1, \ldots, K$,
$B \subseteq B_N$ and $n \in B$, the distribution function for the service time
(i.e., service requirement) of jobs from the $k$th renewal stream and selection
set $B$ that are served at queue $n$; by $m_j$, the mean of $F_j(\cdot)$; and, by
$\mu_j = 1/m_j$, the corresponding service rate. As in [12], we will require that
either (a) $F_j(\cdot)$ depend only on $n$ or (b) $F_j(\cdot)$ depend only on $k$
and $B$, in which case we may write either $F_n(\cdot)$ or $F_{k,B}(\cdot)$ (and
$m_n$ or $m_{k,B}$, respectively, $\mu_n$ or $\mu_{k,B}$) when the context is
clear. (When neither (a) nor (b) holds, analysis is more complicated
and, as explained in Section 6 of [12], stability likely does not follow from
subcriticality.) In [6], [7] and [22], $F_j(\cdot) = F(\cdot)$ does not depend on
$k$, $B$ or $N$ and there is a single renewal stream; in this setting, one can
employ the notation $\alpha$, $m$ and $\mu$. As in [12], we refer to networks
satisfying (a) as class independent and (b) as station independent.

For the class independent case, we define the traffic intensity

(1.2)   $\rho_1 = \max_{B \subseteq B_N} \Big\{ \Big( \sum_{n \in B} \mu_n \Big)^{-1} \sum_k \sum_{A \subseteq B} \alpha_k p_{k,A} \Big\}$

and, for the station independent case,

(1.3)   $\rho_2 = \max_{B \subseteq B_N} \Big\{ |B|^{-1} \sum_k \sum_{A \subseteq B} \alpha_k m_{k,A} p_{k,A} \Big\}$.

When $\rho_1 < 1$, respectively, $\rho_2 < 1$, we say the network is subcritical.
As was observed in [12], it is not difficult to check that when $\rho_i > 1$ in
either (1.2) or (1.3), the corresponding network will be unstable. This behavior
does not depend on the service discipline.
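
For small networks, the traffic intensities can be evaluated by direct enumeration of the subsets B. The following Python sketch assumes the class independent intensity is the maximum over nonempty B of the arrival rate of jobs whose selection set lies in B divided by the total service rate of B, and that the station independent intensity is the corresponding mean work per queue of B. The function names and the dictionary encoding of the selection rule are illustrative assumptions.

```python
from itertools import combinations


def nonempty_subsets(N):
    """Yield every nonempty subset of the queues {0, ..., N - 1}."""
    for size in range(1, N + 1):
        for B in combinations(range(N), size):
            yield frozenset(B)


def rho_class_independent(alpha, p, mu):
    """alpha[k]: arrival rate of stream k; p[(k, A)]: probability that a job
    from stream k is offered selection set A (a frozenset); mu[n]: service
    rate at queue n."""
    best = 0.0
    for B in nonempty_subsets(len(mu)):
        # Rate of jobs that are forced to join a queue inside B.
        load = sum(alpha[k] * prob for (k, A), prob in p.items() if A <= B)
        best = max(best, load / sum(mu[n] for n in B))
    return best


def rho_station_independent(alpha, p, m, N):
    """m[(k, A)]: mean service time of jobs from stream k with selection set A."""
    best = 0.0
    for B in nonempty_subsets(N):
        work = sum(alpha[k] * m[(k, A)] * prob
                   for (k, A), prob in p.items() if A <= B)
        best = max(best, work / len(B))
    return best
```

The enumeration is exponential in N, so this is only meant for checking small examples by hand.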

When the network is both class and station independent, (1.2) and (1.3)
reduce to

(1.4)   $\rho \stackrel{\mathrm{def}}{=} \rho_1 = \rho_2 = \max_{B \subseteq B_N} \Big\{ \frac{m}{|B|} \sum_k \sum_{A \subseteq B} \alpha_k p_{k,A} \Big\}$.

Let $H$ be a subgroup of the permutation group on $B_N$ on which all queues
communicate (i.e., for given $n_1, n_2 \in B_N$, $\pi(n_1) = n_2$ for some
$\pi \in H$), and assume the symmetry condition

(1.5)   $\sum_k \sum_{A \subseteq B^{\pi}} \alpha_k p_{k,A} = \sum_k \sum_{A \subseteq B} \alpha_k p_{k,A}$   for all $B \subseteq B_N$ and $\pi \in H$

is satisfied, where $B^{\pi} = \{n : n = \pi(n') \text{ for some } n' \in B\}$. (This
holds, in particular, in the mean field setting.) Also, assume the network is both
class and station independent. Then it is not difficult to check that (1.4) reduces
to

(1.6)   $\rho = \frac{m}{N} \sum_k \alpha_k$.

(Note that, for given $A$, $A \subseteq B^{\pi}$ for at most $|H||B|/N$
permutations $\pi \in H$, with equality holding when $A$ is a singleton.)

In addition to subcriticality, we will require the following condition on
JSQ networks for Theorems 1.1, 1.2 and 1.3: for some $\Gamma \in \mathbb{Z}_{+,0}$
and $h(\cdot)$, with $h(t) > 0$ for all $t$,

(1.7)   $P_x(\text{at most } \Gamma \text{ potential arrivals at } n \text{ occur over } (0, m_{\max} t]) \ge h(t)$

for all $x$, $t$ and $n$, where $m_{\max} \stackrel{\mathrm{def}}{=} \max_j m_j$.
Note that this condition depends only on the distributions $G_k(\cdot)$ and
probabilities $p_{k,A}$. It is met in most cases, for instance, if (a) for each
$k$ and $n$, $\sum_{A \ni n} p_{k,A} < 1$, in which case one can set $\Gamma = 0$,
or (b) for each $k$ and $y$, $G_k(y) < 1$, where $\Gamma = K + 1$ suffices.
Condition (a) always holds in the mean field setting when $D < N$; condition
(b) is equivalent to (1.11), which is required for Corollary 1.1.

In Theorems 1.1, 1.2 and 1.3, we will employ the nonnegative function
$\|x\|$, or norm, for $x \in S$. It is defined in terms of the norms $\|x\|_L$,
$\|x\|_R$ and $\|x\|_A$ by

(1.8)   $\|x\| = \|x\|_L + \|x\|_R + \|x\|_A$.

We define these components of $\|x\|$ in Section 4. Without going into details
here, we note that $\|x\|_L$ depends on the number of jobs and a truncation of
the residual service time of each job; $\|x\|_R$ depends on just the residual
service time of each job and will be employed for residual service times greater
than the previous truncation; and $\|x\|_A$ measures the residual interarrival
times with appropriate weighting. For given $M > 0$, we denote by $\tau_M(1)$ the
stopping time

(1.9)   $\tau_M(1) = \inf\{t \ge 1 : \|X(t)\| \le M\}$.

We now state Theorem 1.1. Here and elsewhere in the paper, we implicitly 
assume the discipline is non-idling. In Section 2, we will also specify mild 
conditions on how the service effort devoted to individual jobs is allowed to 
change over time. 

Theorem 1.1. For each subcritical JSQ network satisfying (1.7), there
exist $M$ and $C_1$ so that

(1.10)   $E_x[\tau_M(1)] \le C_1(\|x\| \vee 1)$   for all $x$,

where $\|x\|$ is the norm given in (1.8).

For certain service disciplines, such as FIFO and PS, the condition (1.7) 
can be avoided; this is noted after the proof of Proposition 5.1. As men- 
tioned above, (1.7) will in most cases be satisfied irrespective of the service 
discipline. 

The condition (1.10) will imply the positive Harris recurrence of $X(\cdot)$
provided that the states in the state space S communicate with one another 
in an appropriate sense. Petite sets are typically employed for this purpose; 
they will be defined in Section 2. A petite set A has the property that each 
measurable set B is "equally accessible" from all points in A with respect 
to a given nontrivial measure. 

Theorem 1.2. Suppose that a JSQ network is subcritical, satisfies (1.7),
and that $A_M = \{x : \|x\| \le M\}$ is petite for each $M > 0$ for the norm in
(1.8). Then $X(\cdot)$ is positive Harris recurrent.

Theorem 1.2 will follow from Theorem 1.1 by standard reasoning. More 
detail is provided in Section 2. 

A standard criterion that ensures the above sets $A_M$ are petite is given
by the following two conditions on the interarrival times. In various works
on stability (e.g., [3], [9] and [12]), these conditions are employed rather
than the more abstract notion of petite set. The first condition is that the
distribution $G_k(\cdot)$ is unbounded for each $k$, that is,

(1.11)   $\bar{G}_k(y) = 1 - G_k(y) > 0$   for all $y$.

The second condition is that, for some $\ell_k \in \mathbb{Z}_+$, the
$\ell_k$-fold convolution $G_k^{*\ell_k}(\cdot)$ of $G_k(\cdot)$ and Lebesgue
measure are not mutually singular. That is, for some nonnegative $q_k(\cdot)$
with $\int_0^{\infty} q_k(s)\,ds > 0$,

(1.12)   $G_k^{*\ell_k}(d) - G_k^{*\ell_k}(c) \ge \int_c^d q_k(s)\,ds$

for all $c < d$. When the interarrival times are exponentially distributed, both
(1.11) and (1.12) are immediate. More detail is given in Section 2.

We therefore have the following corollary of Theorem 1.2. As noted earlier, 
(1.7) is automatic in this setting. 

Corollary 1.1. Suppose that a subcritical JSQ network has interarrival
times that satisfy (1.11) and (1.12). Then $X(\cdot)$ is positive Harris recurrent.

By employing Theorem 1.2 and the bounds obtained in the derivation
of Theorem 1.1, one can obtain uniform bounds on the equilibria restricted
to individual queues and to individual arrival streams for families of JSQ
networks. Such a family $\mathcal{A}$ will be required to satisfy the following
uniformity conditions on the service and interarrival distributions
$F_j^{(a)}(\cdot)$ and $G_k^{(a)}(\cdot)$, for $a \in \mathcal{A}$:

(1.13)   $\sup_{a \in \mathcal{A}} \max_j \mu_j^{(a)} \int_{M/\mu_j^{(a)}}^{\infty} y\, F_j^{(a)}(dy) \to 0$   as $M \to \infty$

and

(1.14)   $\sup_{a \in \mathcal{A}} \max_k \alpha_k^{(a)} \int_{M/\alpha_k^{(a)}}^{\infty} y\, G_k^{(a)}(dy) \to 0$   as $M \to \infty$.

We require

(1.15)   $\sup_{a \in \mathcal{A}} \rho^{(a)} < 1$,

where $\rho^{(a)} = \rho_1^{(a)}$ or $\rho^{(a)} = \rho_2^{(a)}$, depending on
whether the network is class independent or station independent. Setting

$m_{\mathrm{ratio}}^{(a)} = 1$ for class independent networks,
$m_{\mathrm{ratio}}^{(a)} = \max_{k,B} m_{k,B}^{(a)} / \min_{k,B} m_{k,B}^{(a)}$ for station independent networks,

we also require that

(1.16)   $m_{\mathrm{ratio}} \stackrel{\mathrm{def}}{=} \sup_{a \in \mathcal{A}} m_{\mathrm{ratio}}^{(a)} < \infty$.

We also need a uniform version of (1.7), namely that, for some
$\Gamma \in \mathbb{Z}_{+,0}$ and $h(\cdot)$, with $h(t) > 0$ for all $t$,

(1.17)   $P_x^{(a)}(\text{at most } \Gamma \text{ potential arrivals at } n \text{ occur over } (0, m_{\max}^{(a)} t]) \ge h(t)$

for all $a \in \mathcal{A}$, $x$, $t$ and $n$. Here, $P_x^{(a)}(\cdot)$ denotes
the transition kernel with respect to $a$. When the selection rules
$p_{k,B}^{(a)}$ satisfy $p_{k,B}^{(a)} = 0$ for $|B| > M$ for some $M$ not
depending on $a$, we will say the selection rules have uniformly
bounded support.

In many cases of interest, conditions (1.13)-(1.17) are not difficult to
check. For instance, when each network has a single Poisson arrival stream
and $F_j(\cdot)$ does not depend on $j$ or $a$, all conditions except for (1.15)
and (1.17) are automatic. When, in addition, the selection rules have uniformly
bounded support, then (1.17) also holds.

By Theorem 1.2, for any subcritical JSQ network $a \in \mathcal{A}$, with $A_M$
petite for $M > 0$, the underlying Markov process $X^{(a)}(\cdot)$ is positive
Harris recurrent; this follows by employing the analog of (1.8),

(1.18)   $\|x\|^{(a)} = \|x\|_L^{(a)} + \|x\|_R^{(a)} + \|x\|_A^{(a)}$.

In the statement of Theorem 1.3, we also employ the local norm at
$n = 1, \ldots, N^{(a)}$,

(1.19)   $|x|_n^{(a)} = z_n + \ell_n^{(a)} + w_n^{(a)}$   for $x \in S^{(a)}$.

Here, $z_n$ is the number of jobs at queue $n$ and $w_n^{(a)}$ is the weighted
workload at $n$, that is,

(1.20)   $w_n^{(a)} = \sum_{i=1}^{z_n} w_{n,i}^{(a)} = \sum_{i=1}^{z_n} \mu_{j_{n,i}}^{(a)} v_{n,i}$,

where $\mu_{j_{n,i}}^{(a)}$ denotes the service rate of the $i$th job at queue $n$
and $v_{n,i}$ is the residual service time of this job. The term $\ell_n^{(a)}$ is
the maximum weighted age at $n$, that is,

(1.21)   $\ell_n^{(a)} = \max_{i=1,\ldots,z_n} \ell_{n,i}^{(a)} = \max_{i=1,\ldots,z_n} \mu_{j_{n,i}}^{(a)} o_{n,i}$,

where $o_{n,i}$ is the age of the $i$th job at queue $n$ (the time since its
arrival at $n$). We set

(1.22)   $s_k^{(a)} = \alpha_k^{(a)} u_k$,

where $u_k$ is the residual interarrival time at the arrival stream $k$; we refer
to $s_k^{(a)}$ as the weighted interarrival time. (Since $x$ and $z$ do not
explicitly depend on $a \in \mathcal{A}$, the corresponding superscripts are
omitted.) We will employ, in Section 4, some of the terminology and conditions
introduced in (1.13)-(1.22), when demonstrating Theorem 1.1 for a given JSQ
network.
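
As a small worked illustration of the local quantities just introduced, the following Python sketch computes the number of jobs, the weighted workload, the maximum weighted age and their sum at a single queue, together with a weighted interarrival time. It assumes the local norm is the sum of these three terms; the tuple layout for jobs and the function names are hypothetical and chosen only for this example.

```python
def local_norm(jobs):
    """jobs: list of (mu, v, o) for the jobs at one queue, where mu is the
    job's service rate, v its residual service time and o its age.

    Returns (z, w, ell, norm) with
      z    = number of jobs,
      w    = weighted workload  sum(mu * v),
      ell  = maximum weighted age  max(mu * o)  (0 for an empty queue),
      norm = z + ell + w.
    """
    z = len(jobs)
    w = sum(mu * v for mu, v, _ in jobs)
    ell = max((mu * o for mu, _, o in jobs), default=0.0)
    return z, w, ell, z + ell + w


def weighted_interarrival(alpha_k, u_k):
    """Residual interarrival time u_k scaled by the arrival rate alpha_k."""
    return alpha_k * u_k
```

Scaling by the rates makes the quantities dimensionless, which is what allows them to be compared across the networks of a family.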

Since $X^{(a)}(\cdot)$ is positive Harris recurrent, it will have a unique
equilibrium measure on $S^{(a)}$, which we denote by $\mathcal{E}^{(a)}$. In order
to ensure that the marginal distribution at a given queue $n$ does not depend on
$n$, we also assume that, for some subgroup $H^{(a)}$ of the permutation group on
$B_{N^{(a)} + K^{(a)}}$, with queues being mapped to queues, arrival streams to
arrival streams and on which all queues (but not necessarily all arrival streams)
communicate,

(1.23)   $X^{(a)}(\cdot)$ and $X_{\pi}^{(a)}(\cdot)$ are stochastically equivalent for all $\pi \in H^{(a)}$.

(That is, $X^{(a)}(\cdot)$ and $X_{\pi}^{(a)}(\cdot)$ have the same joint
distributions.) Here, $X_{\pi}^{(a)}(\cdot)$ is the stochastic process induced from
$X^{(a)}(\cdot)$ by permuting the queues and arrival streams according to $\pi$.
We will call such a JSQ network symmetric. As an example of such a JSQ network,
one can consider $N$ queues arranged uniformly along a circle, along with $N$
arrival streams that alternate with the queues, with jobs arriving at a given
stream by selecting the shortest queue within a preassigned distance of the stream,
and the service discipline and service and interarrival distributions not depending
on the queue, arrival stream or selection set.

Employing the preceding conditions, we now state our third main result.
Here and later on, $\mathcal{E}^{(a)}(\cdot)$ denotes the probability of the
indicated event with respect to the measure $\mathcal{E}^{(a)}$, and $X$ and
$S_k^{(a)}$ denote the random variables corresponding to $x$ and $s_k^{(a)}$.
(The random variables $S^{(a)} = (S_1^{(a)}, \ldots, S_K^{(a)})$
should not be confused with the state space.)

Theorem 1.3. Suppose that a family $\mathcal{A}$ of JSQ networks satisfies the
uniformity conditions (1.13)-(1.17) and (1.23), and that, for each network
$a \in \mathcal{A}$, $A_M = \{x : \|x\|^{(a)} \le M\}$ is petite for each $M > 0$
with respect to the norms in (1.18). Then

(1.24)   $\sup_{a \in \mathcal{A}} \mathcal{E}^{(a)}(|X|_n^{(a)} > M) \to 0$   as $M \to \infty$,

for each $n$, and

(1.25)   $\sup_{a \in \mathcal{A}} \max_k \mathcal{E}^{(a)}(S_k^{(a)} > M) \to 0$   as $M \to \infty$.

The limits (1.24) and (1.25) supply uniform bounds on the number of jobs,
maximum weighted age and weighted workload at each individual queue $n$,
and on the weighted interarrival times at each arrival stream $k$. When these
limits hold, we say the equilibria $\mathcal{E}^{(a)}$, for $a \in \mathcal{A}$,
are locally bounded. Note that, on account of (1.23), the probabilities in (1.24)
do not depend on $n$.

In many cases of interest, the conditions in Theorem 1.3 are not difficult
to check. As mentioned in conjunction with (1.13)-(1.17), when each member
of $\mathcal{A}$ has a single Poisson arrival stream, $F_j(\cdot)$ does not depend
on $j$ or $a$, and the selection rules have uniformly bounded support, then all of
these properties except (1.15) hold. Since (1.11) and (1.12) hold for Poisson
arrivals, the sets $A_M$ are petite. Also, under (1.23), the traffic intensity can
be written as in (1.6). We therefore obtain the following corollary of
Theorem 1.3. Recall that a selection rule is mean field if, for a given $D$, each
nonrepeating $D$-tuple is chosen with equal probability from among the
queues, and note that JSQ networks with a mean field selection rule and for
which $F_j$ does not depend on $j$ or $a$ are also symmetric.

Corollary 1.2. Suppose that each member of a family $\mathcal{A}$ of JSQ
networks has a single Poisson arrival stream, that $F_j$ does not depend on $j$
or $a$, that the selection rules have uniformly bounded support, and that

(1.26)   $\sup_{a \in \mathcal{A}} \alpha^{(a)} m / N^{(a)} < 1$.

If (1.23) is satisfied, then (1.24) and (1.25) hold. In particular, if the
selection rules are mean field, then (1.24) and (1.25) hold.

As mentioned earlier, when a JSQ network has a single Poisson arrival 
stream, one can omit the interarrival times from the state space descrip- 
tor. In this case, the limit (1.25) is no longer relevant in Theorem 1.3 and 
Corollary 1.2. 

When $\mathcal{A}$ is given by a family of networks indexed by the number $N$ of
queues, Theorem 1.3 provides local bounds on $\mathcal{E}^{(N)}$ as
$N \to \infty$. These bounds can be used to show the relative compactness of the
restriction of $\mathcal{E}^{(N)}$ to finite sets of queues; this is done in
Theorem 6.1. As mentioned earlier, these local bounds and relative compactness of
the sequence provide a framework for approximating the corresponding marginal
distributions for large $N$ ([6, 7]). In this context, one employs appropriate
mean field equations corresponding to the marginal distributions of the equilibrium
of a limiting infinite system. Under appropriate assumptions on the
service times, the solutions of these mean field equations satisfy bounds on
the number of jobs that are on the same order as in the limit in (1.1).

In Section 7, we present a family of mean field, symmetric networks, with 
a single Poisson arrival stream, N = D = 2, and an appropriate service 
discipline that illustrates how the JSQ rule can produce equilibria for which 
the typical workload is incredibly large, much larger than the workload for 
the analogous network with D = 1. So, in terms of workload, the JSQ rule 
can sometimes yield poor results. 

Outline of the paper and main ideas. In Section 2, we will provide a
brief background on Markov processes that will be relevant to the space $S$
and Markov process $X(\cdot)$ employed in the introduction, and we will provide
a more detailed construction of $S$ and $X(\cdot)$. We will also provide the
background that is needed to derive Theorem 1.2 from Theorem 1.1 and to 
obtain Corollary 1.1. The machinery for this is standard in the context of 
queueing networks and is easily modified so as to apply to JSQ networks. 

In Section 3, we provide an alternative formulation of the traffic intensities 
in (1.2) and (1.3) that we employ in successive sections. This formulation will 
enable us to compare JSQ networks to networks with appropriate random 
assignment of jobs to queues. 

Theorem 1.1 is demonstrated in Sections 4 and 5. The main tool is an
appropriate Lyapunov function that is given in terms of the norms $\|x\|_L$,
$\|x\|_R$ and $\|x\|_A$ in (1.8). Our analysis involves decomposing time into
random intervals over which no jobs enter the network. Over each such interval, the
evolution of the system is deterministic and $\|X(t)\|$ is shown to be decreasing
at large values. At times $t$ where a job enters the network, the average
value of $\|X(t)\| - \|X(t-)\|$ is shown to be negative. Applying the strong
Markov property and iterating over such intervals until time $\tau_M$,
$\tau_M = \inf\{t : \|X(t)\| \le M\}$, will imply (1.10) of Theorem 1.1. At the
end of Section 5, we
briefly discuss the networks mentioned at the beginning of the introduction 
where jobs are assigned to the queue with the least remaining work. We refer 
to such networks as join the least loaded queue (JLLQ) networks. The JLLQ 
rule is easier to handle than the JSQ rule. It is analyzed in [12] using fluid 
limits; here, we mention an alternative approach. At the end of the section, 
we also mention analogs of Theorem 1.1 where the state space S is modified. 

Theorem 1.3 is demonstrated in Section 6. The basic idea there is to
employ estimates from Sections 4 and 5 to obtain lower bounds on the rate
at which $\|X^{(a)}(t)\|^{(a)}$ decreases at large values of $|X^{(a)}(t)|_n^{(a)}$.
If these values occur with too high a probability at a given $a \in \mathcal{A}$,
it will follow that, for appropriate $t$,

$E_{\mathcal{E}^{(a)}}[\|X^{(a)}(t)\|^{(a)}] < E_{\mathcal{E}^{(a)}}[\|X^{(a)}(0)\|^{(a)}]$.

Since $\mathcal{E}^{(a)}$ is an equilibrium, this is not possible, which gives us
upper bounds on the probabilities in (1.24) and (1.25). Theorem 6.1 follows quickly
from Theorem 1.3 and Prohorov's Theorem.

In Section 7, we discuss the family of networks mentioned earlier, with 
N = D = 2, whose equilibria have very large workload. This behavior arises 
because of the manner in which the service discipline restricts service for 
jobs with large residual service time. The main result is given in Theorem 
7.1. Because of the length of the argument, we provide a condensed proof of 
part of the theorem. 

Notation. For the reader's convenience, we mention here some of the
notation in the paper. The term $x$ indicates a state in the state space $S$
and the corresponding term $X(t)$ (or $X$) indicates a random state at time
$t$ (or with respect to a given measure); quantities such as $z_n$ and $Z_n(t)$,
and $w_{n,i}$ and $W_{n,i}(t)$ (or $Z_n$ and $W_{n,i}$) play analogous roles. The
results in this paper are for class independent and for station independent
networks; in order to avoid repeating all statements and proofs for the two cases,
we employ notation with the symbol $\circ$, such as $m^{\circ}_{n,i}$, which will
have different meanings in the two cases. The main norms with which we will work
are $\|\cdot\|$, $\|\cdot\|_L$, $\|\cdot\|_R$ and $\|\cdot\|_A$. The symbols $L$,
$R$ and $A$ also appear in other contexts, such as for the numbers $L_i$,
$i = 1, \ldots, 4$, service effort per job $R_{n,i}$ and sets $A$; the meaning
should always be clear from the context. The symbols $\mathbb{Z}_+$ and
$\mathbb{R}_+$ denote the positive integers and positive real numbers,
with $\mathbb{Z}_{+,0} = \mathbb{Z}_+ \cup \{0\}$ and
$\mathbb{R}_{+,0} = \mathbb{R}_+ \cup \{0\}$; $\lfloor y \rfloor$ denotes the
integer part of $y \in \mathbb{R}_+$ and $1\{A\}$ denotes the indicator function of
the event $A$.

2. Markov process background. In this section, we provide a more
detailed description of the construction of the Markov process $X(\cdot)$ that
underlies a JSQ network. We then show how Theorem 1.2 and its corollary
follow from Theorem 1.1. Analogs of this material for queueing networks are
given in Bramson [4], and, for networks with weighted max-min fair policies,
in Bramson [5]. Because of the similarity of these settings, we present a
summary here and refer the reader to [4] for more detail.

Construction of the Markov process. We define the state space $S$ to be
the set

(2.1)   $(\mathbb{Z}^3 \times 2^{B_N} \times \mathbb{R}^3)^{\infty} \times \mathbb{R}^K$

subject to the following constraints. The components $s_k$, $k = 1, \ldots, K$, of
$\mathbb{R}^K$ are all positive; they correspond to the residual interarrival times
$u_k$ of the $K$ arrival streams, scaled by the arrival rates as in (1.22). Only a
finite number of the 7-tuples of coordinates of
$(\mathbb{Z}^3 \times 2^{B_N} \times \mathbb{R}^3)$ are nonzero.
Such a 7-tuple corresponds to a particular job in the network: the first
coordinate $n$, $n = 1, \ldots, N$, corresponds to the queue of the job and the
second coordinate $i$, $i = 1, \ldots, z_n$, gives its rank at the queue based on
the time of arrival there, with "older" jobs receiving a lower rank. The third and
fourth coordinates $k$, $k = 1, \ldots, K$, and $A$,
$\emptyset \ne A \subseteq B_N$, correspond to the arrival stream and selection set
of the job when it entered its queue. The fifth coordinate $\ell$, $\ell \ge 0$,
measures the age $o$ of the job, scaled by $\mu_{j_{n,i}}$, as in (1.21); the sixth
coordinate $w$, $w > 0$, measures its residual service time $v$, scaled by
$\mu_{j_{n,i}}$, as in (1.20); and the last coordinate $r$, $r \in [0, 1]$, is the
current service effort devoted to the job. (If one wishes, one can introduce
other coordinates in the state space descriptor, such as the elapsed time
since the last arrival from each stream, or the amount of service already
received by each job.) Since the discipline is assumed to be non-idling, the
sum of the last coordinates for all jobs at a given nonempty queue must
equal 1. Note that for given $N$ and $K$, the state space $S$ constructed in this
manner is unique.

The last five coordinates may be considered as functions of the first two,
and written as $k_{n,i}$, $A_{n,i}$, $\ell_{n,i}$, $w_{n,i}$ and $r_{n,i}$. The
third and fourth coordinates, $k_{n,i}$ and $A_{n,i}$, are needed because of how
$\|\cdot\|$ is defined in Section 4 and can be omitted for class independent
networks. Various coordinates can also be omitted for particular service
disciplines (such as FIFO).

For given $N' < N$ and $K' = K$, one can define $S'$ as above, but with
$n = 1, \ldots, N'$. Then $S'$ is the projection of $S$ obtained by restricting
nonzero 7-tuples to the first $N'$ queues. For $x \in S$, the projection
$x' \in S'$ of $x$ is the element obtained by omitting 7-tuples with $n > N'$. One
can define projections of $S$ onto spaces $S'$ corresponding to other subsets of
$\{1, \ldots, N\}$ analogously, but we will not use these in the paper.

Employing the above notation, we construct a metric $d(\cdot, \cdot)$ on $S$: for
given $x, x' \in S$, with the coordinates labelled correspondingly, we set

(2.2)   $d(x, x') = \sum_{n=1}^{N} \sum_{i=1}^{\infty} \big( (|\ell_{n,i} - \ell'_{n,i}| + |w_{n,i} - w'_{n,i}| + |r_{n,i} - r'_{n,i}|) \wedge 1 \big)$
        $+ \sum_{n=1}^{N} \sum_{i=1}^{\infty} \big( |k_{n,i} - k'_{n,i}| + 1\{A_{n,i} \ne A'_{n,i}\} \big)$
        $+ \sum_{n=1}^{N} |z_n - z'_n| + \sum_{k=1}^{K} |s_k - s'_k|$.

One can check that $d(\cdot, \cdot)$ is separable and locally compact; more detail
is given on page 82 of [4]. One can also check that the sets $E_M \subset S$,
$M > 1$, defined by $z_n \le M$, $\ell_{n,i} \le M$, $1/M \le w_{n,i} \le M$ and
$1/M \le s_k \le M$, for all $n = 1, \ldots, N$, $k = 1, \ldots, K$ and
$i = 1, \ldots, z_n$, are compact with respect to $d(\cdot, \cdot)$. We equip $S$
with the standard Borel $\sigma$-algebra inherited from $d(\cdot, \cdot)$,
which we denote by $\mathscr{S}$. In Lemma 4.1, we will show $\|\cdot\|_L$,
$\|\cdot\|_R$ and $\|\cdot\|_A$ are continuous in $d(\cdot, \cdot)$.

At the end of Section 6, we will employ the partial completion $\bar{S}$ of $S$
that is obtained by allowing the weighted residual service times $w_{n,i}$ to take
values in $[0, \infty)$ rather than just $(0, \infty)$. Otherwise, the construction
of $\bar{S}$ is the same as that just given for $S$. The metric
$\bar{d}(\cdot, \cdot)$ is defined analogously to $d(\cdot, \cdot)$, and the Borel
$\sigma$-algebra $\bar{\mathscr{S}}$ is defined correspondingly. Under
$\bar{d}(\cdot, \cdot)$, the sets $\bar{E}_M \subset \bar{S}$, $M > 1$, defined by
$z_n \le M$, $\ell_{n,i} \le M$, $0 \le w_{n,i} \le M$ and $1/M \le s_k \le M$, for
all $n = 1, \ldots, N$, $k = 1, \ldots, K$ and $i = 1, \ldots, z_n$, are compact.
One can define projections from $\bar{S}$ onto spaces $\bar{S}'$ in the same
manner as was done from $S$ onto $S'$.

The Markov process $X(t)$, $t \ge 0$, underlying the network is defined to be
the right continuous process taking values $x$ in $S$ whose evolution is determined
by the given JSQ rule together with the assigned service discipline.
Jobs $(n,i)$ are allocated service according to rates $R_{n,i}(t)$ (the service effort
per job) that are assumed to be constant in between arrivals and departures
of jobs at the queues. Over such an interval, $\mathcal{L}_{n,i}(t)$ increases at rate $\mu_{j_{n,i}}$,
and $W_{n,i}(t)$ and $S_k(t)$ decrease at rates $\mu_{j_{n,i}} R_{n,i}(t)$ and $\alpha_k$, respectively.
(We write $\mathcal{L}_{n,i}(t)$ for the age functions to avoid possible confusion later with
constants $L_i$ that will be introduced.) Upon an arrival or departure, rates
are re-assigned according to the discipline. The restriction made here for
the discipline, that service rates remain constant between arrivals and departures
of jobs, is for convenience, and allows one to inductively construct
$X(\cdot)$ over increasing times in a simple way. The standard service disciplines
satisfy this property. We also note that the construction here is not restricted
to the JSQ rule, and applies to other rules for assigning arriving jobs to a
queue.
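
The inductive, event-by-event construction described above can be illustrated with a minimal simulation sketch. It is not the construction of the paper: it tracks only the queue lengths, and it assumes (purely for illustration) a single Poisson arrival stream, exponential service times, FIFO service, selection sets chosen uniformly among subsets of a fixed size, and ties broken by lowest queue index. The function name `simulate_jsq` and all of its parameters are hypothetical.

```python
import heapq
import random


def simulate_jsq(num_queues, subset_size, arrival_rate, service_rate,
                 horizon, seed=0):
    """Event-driven sketch of a single-stream JSQ network with FIFO queues.

    Assumptions (not from the paper): Poisson arrivals, exponential
    service times, uniformly chosen selection sets of fixed size, and
    ties broken by lowest queue index.
    """
    rng = random.Random(seed)
    queue_len = [0] * num_queues
    arrivals = departures = 0
    # Event heap holds (time, kind, queue); kind 0 = arrival, 1 = departure.
    events = [(rng.expovariate(arrival_rate), 0, -1)]
    while events:
        t, kind, n = heapq.heappop(events)
        if t > horizon:
            break
        if kind == 0:
            # The arriving job sees a random selection set and joins
            # the queue in it with the fewest jobs.
            selection = rng.sample(range(num_queues), subset_size)
            n = min(selection, key=lambda q: (queue_len[q], q))
            queue_len[n] += 1
            arrivals += 1
            if queue_len[n] == 1:  # server was idle, so service starts now
                heapq.heappush(
                    events, (t + rng.expovariate(service_rate), 1, n))
            heapq.heappush(events, (t + rng.expovariate(arrival_rate), 0, -1))
        else:
            queue_len[n] -= 1
            departures += 1
            if queue_len[n] > 0:  # non-idling: next job enters service
                heapq.heappush(
                    events, (t + rng.expovariate(service_rate), 1, n))
    return queue_len, arrivals, departures
```

Between consecutive events nothing in this reduced state changes, which mirrors the requirement that service rates stay constant between arrivals and departures. Replacing the `min(...)` line by another assignment rule gives a non-JSQ network with the same construction.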

Along the lines of page 85 of [4], a filtration $(\mathcal{F}_t)$, $t \in [0,\infty]$, can be
assigned to $X(\cdot)$ so that $X(\cdot)$ is Borel right and, in particular, is strong
Markov. The processes $X(\cdot)$ fall into the class of piecewise-deterministic
Markov processes, for which the reader is referred to Davis [10] for more
detail.

Recurrence. The Markov process $X(\cdot)$ is said to be Harris recurrent if,
for some nontrivial $\sigma$-finite measure $\varphi$,

$$ \varphi(B) > 0 \ \text{ implies } \ P_x(\eta_B = \infty) = 1 \quad \text{for all } x \in S, $$

where $\eta_B = \int_0^\infty 1\{X(t) \in B\}\,dt$. If $X(\cdot)$ is Harris recurrent, it possesses a
stationary measure $\pi$ that is unique up to a constant multiple. When $\pi$ is
finite, $X(\cdot)$ is said to be positive Harris recurrent.

A practical condition for determining positive Harris recurrence can be
given by using petite sets. A nonempty set $A \in \mathscr{S}$ is said to be petite if for
some fixed probability measure $a$ on $(0,\infty)$ and some nontrivial measure $\nu$
on $(S, \mathscr{S})$,

$$ \nu(B) \le \int_0^\infty P^t(x,B)\,a(dt) $$

for all $x \in A$ and $B \in \mathscr{S}$. Here, $P^t(\cdot,\cdot)$, $t \ge 0$, is the semigroup associated
with $X(\cdot)$. As mentioned in the introduction, a petite set $A$ has the property
that each set $B$ is "equally accessible" from all points $x \in A$ with respect to
the measure $\nu$. Note that any nonempty measurable subset of a petite set is
also petite.

For given $\delta > 0$, set

$$ \tau_A(\delta) = \inf\{t \ge \delta : X(t) \in A\} $$

and $\tau_A = \tau_A(0)$. Then $\tau_A(\delta)$ is a stopping time. Employing petite sets
and $\tau_A(\delta)$, one has the following characterization of Harris recurrence and
positive Harris recurrence. (The Markov process and state space need to
satisfy minimal regularity conditions, as on page 86 of [4].) The criteria are
from Meyn and Tweedie [16]; discrete time analogs of the different parts of
the proposition have long been known, see, for instance, Nummelin [18] and
Orey [19].

Theorem 2.1. (a) A Markov process $X(\cdot)$ is Harris recurrent if and
only if there exists a closed petite set $A$ with

$$ P_x(\tau_A < \infty) = 1 \quad \text{for all } x \in S. \tag{2.3} $$

(b) Suppose the Markov process $X(\cdot)$ is Harris recurrent. Then $X(\cdot)$ is positive
Harris recurrent if and only if there exists a closed petite set $A$ such
that, for some $\delta > 0$,

$$ \sup_{x \in A} E_x[\tau_A(\delta)] < \infty. \tag{2.4} $$

Theorem 1.2 and its corollary. Theorem 1.2 follows from Theorem 1.1,
Theorem 2.1 and the elementary continuity result, Lemma 4.1. To see this,
note that both conditions (2.3) and (2.4) of Theorem 2.1 are immediate
consequences of (1.10) of Theorem 1.1, with $A = A_M$, for appropriate $M$,
and $\delta = 1$ since, in Theorem 1.2, $A_M$ is assumed to be petite. By Lemma
4.1, the norm $\|\cdot\|$ in (1.8) is continuous in the metric $d(\cdot,\cdot)$, and hence $A_M$
is also closed. It therefore follows from Theorem 2.1 that $X(\cdot)$ is positive
Harris recurrent, which implies Theorem 1.2.

Corollary 1.1 follows immediately from Theorem 1.2 and the assertion,
before the statement of the corollary, that the sets $A_M$ are petite under the
assumptions (1.11) and (1.12). A somewhat stronger version of the analogous
assertion for queueing networks is demonstrated in Proposition 4.7 of [4].
(The proposition states that the sets $A$ are uniformly small.) The reasoning
is the same in both cases and does not involve the JSQ rule or the service
discipline. The argument, in essence, requires that one wait long enough for
the network to have at least a given positive probability of being empty;
the time $t$ does not depend on $x$ for $\|x\| \le M$. This will follow from (1.11)
and the definition of $\|\cdot\|$ in Section 4, since the work in the network is
bounded by a linear function of $M$. Since the residual interarrival times are
also bounded by a linear function of $M$, by using (1.12), one can also show
that the joint distribution function of the residual interarrival times has an
absolutely continuous component at this time, whose density is bounded
away from 0. It will follow that the set $A_M$ is petite with respect to $\nu$, with
$a$ chosen as the point mass at $t$, if $\nu$ is concentrated on the empty states,
where it is a small enough multiple of $K$-dimensional Lebesgue measure
restricted to a small cube.

3. A useful routing lemma. In this section, we rephrase the condi- 
tions (1.2) and (1.3) that are used to define subcriticality for class inde- 
pendent and station independent JSQ networks by using sums that will be 
more convenient for us to work with when proving Theorem 1.1 in Sections 
4 and 5. For this, we employ Lemma 3.1. The desired sums for class indepen- 
dent and station independent JSQ networks are then given in the following 
corollary. 
For the lemma, we employ the following notation. We consider $\beta_{k,A,n} \ge 0$
for $k = 1,\ldots,K$, $n = 1,\ldots,N$ and $A \subseteq B_N$, where $B_N = \{1,\ldots,N\}$, and
assume that, for each $k$ and $A$,

$$ \beta_{k,A,n} > 0 \ \text{ iff } \ \beta_{k,A,n'} > 0 \quad \text{for all } n, n' \in A, \tag{3.1} $$

that is, whether or not $\beta_{k,A,n}$ is zero does not depend on $n$, for $n \in A$.
For each $B \subseteq B_N$, let $\{r_{B,n},\ n = 1,\ldots,N\}$ be a probability distribution
concentrated on $B$, with $r_{B,n} > 0$ for $n \in B$, so that for each $k$, $B$, and
$A \subseteq B$, and $n$ restricted to $B$,

$$ \gamma_{k,A,B} \stackrel{\mathrm{def}}{=} r_{B,n}\,\beta_{k,A,n} \quad \text{does not depend on } n. \tag{3.2} $$

Lemma 3.1. Suppose that $\beta_{k,A,n}$, $\gamma_{k,A,B}$ and $r_{B,n}$ are chosen as in (3.1)-
(3.2), and moreover that $\gamma_{k,A,B}$ satisfies

$$ \sum_k \sum_{A \subseteq B} \gamma_{k,A,B} \le \rho \quad \text{for all } B \subseteq B_N \tag{3.3} $$

for some $\rho$. Then, for each $k$ and $A$, there is a probability distribution
$\{q_{k,A,n},\ n = 1,\ldots,N\}$ concentrated on $A$, so that

$$ \sum_k \sum_{A \subseteq B_N} \beta_{k,A,n}\, q_{k,A,n} \le \rho \quad \text{for all } n. \tag{3.4} $$

As a consequence of Lemma 3.1, we obtain the following two inequalities
from (1.2) and (1.3).

Corollary 3.1. (a) Suppose that (1.2) holds for a class independent
JSQ network. Then, for each $k$ and $A$, there is a probability distribution
$q_{k,A,n}$ concentrated on $A$ so that

$$ \sum_k \sum_{A \subseteq B_N} \alpha_k\, p_{k,A}\, q_{k,A,n}\, m_n \le \rho_1 \quad \text{for all } n. \tag{3.5} $$

(b) Suppose that (1.3) holds for a station independent JSQ network. Then,
for each $k$ and $A$, there is a probability distribution $q_{k,A,n}$ concentrated on
$A$ so that

$$ \sum_k \sum_{A \subseteq B_N} \alpha_k\, p_{k,A}\, q_{k,A,n}\, m_{k,A} \le \rho_2 \quad \text{for all } n. \tag{3.6} $$
Corollary 3.1 converts (1.2) and (1.3) into (3.5) and (3.6), which will be
easier to work with in Section 4. When $\rho_1 < 1$, respectively, $\rho_2 < 1$, these
inequalities imply that the network, where incoming jobs are assigned the
queue $n$ with probabilities $q_{k,A,n}$, is subcritical. On the other hand, by substituting
$B$ for $B_N$ in the inner sum and summing over $n$, one can check that
the inequalities in (3.5), respectively, (3.6), cannot be strict for all $n$. These
conditions therefore give an alternative characterization for subcriticality
when $\rho_i < 1$. (We will not need this in the paper.)

One can check that when the JSQ network is symmetric, one may set
$q_{k,A,n} = 1/|A|$ for $n \in A$. This follows by summing the left side of (3.6) over
all $n$ and comparing the sum with the right side of (1.3), for $B = B_N$.
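
As a small numerical illustration of the symmetric choice $q_{k,A,n} = 1/|A|$, the following sketch evaluates the left side of (3.6) at each queue in a hypothetical symmetric network. The setup is an assumption made for the example, not taken from the paper: a single arrival stream ($K = 1$), a common mean service time, and selection sets drawn uniformly from all subsets of a fixed size. The function name `per_queue_load` is likewise hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def per_queue_load(num_queues, subset_size, alpha, mean_service):
    """Left side of (3.6) at each queue n under q_{k,A,n} = 1/|A|.

    Hypothetical symmetric setup: one arrival stream (K = 1) and
    selection sets drawn uniformly from all subsets of a fixed size.
    """
    subsets = list(combinations(range(num_queues), subset_size))
    p = Fraction(1, len(subsets))          # p_{k,A}, uniform over subsets
    load = [Fraction(0)] * num_queues
    for A in subsets:
        q = Fraction(1, len(A))            # q_{k,A,n} = 1/|A| for n in A
        for n in A:
            load[n] += alpha * p * q * mean_service
    return load
```

In this setup every queue receives the same load, namely the arrival rate times the mean service time divided by the number of queues, so the inequalities in (3.6) hold with equality at every $n$, in line with the remark that they cannot all be strict.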

Proof of Corollary 3.1. (a) We apply Lemma 3.1, setting

$$ \beta_{k,A,n} = \alpha_k\, p_{k,A}\, m_n, \qquad r_{B,n} = \frac{1\{n \in B\}\,\mu_n}{\sum_{n' \in B} \mu_{n'}}, \qquad \rho = \rho_1. \tag{3.7} $$

It is easy to check that the conditions (3.1)-(3.2) are satisfied with this
choice of $\beta_{k,A,n}$ and $r_{B,n}$. Substitution of these quantities into (3.3) gives
the quantity in braces on the right side of (1.2) for each choice of $B$, and
substitution into (3.4) gives (3.5). Part (a) of the corollary therefore follows
from the lemma.

(b) The argument for this part is analogous to the first part; here, we set

$$ \beta_{k,A,n} = \alpha_k\, p_{k,A}\, m_{k,A}, \qquad r_{B,n} = 1\{n \in B\}/|B|, \qquad \rho = \rho_2. \tag{3.8} $$

It is easy to check that the conditions (3.1)-(3.2) are again satisfied. Substitution
into (3.3) gives the quantity in braces on the right side of (1.3) and
substitution into (3.4) gives (3.6). Part (b) of the corollary therefore also
follows from the lemma. □

We now prove the lemma.

Proof of Lemma 3.1. For a family of probability distributions $q_{k,A,n}$
indexed by $k$ and $A$, and concentrated on $A$, set $\zeta^{q}_{k,A,n} = \beta_{k,A,n}\, q_{k,A,n}$,


rn 




and 



(3.10) 



V 



mm 



min V(q). 
One can check that $V(q)$ is continuous in $q$ and that the set of $q$ is compact
with respect to the metric

$$ d(q,q') = \max_{k,A,n} |q_{k,A,n} - q'_{k,A,n}|. \tag{3.11} $$

So, $V^{\min}$ is attained at some $q^{\min}$. We set

^min 



(3.12) 



A z 




where $\zeta^{\min}_{k,A,n} = \beta_{k,A,n}\, q^{\min}_{k,A,n}$. In order to show (3.4), it suffices to show
$A^{\min} = \emptyset$.

We first claim that for any k' , A' C and n\ E A mm , 
(3.13) A' C A min if (%% m > 0. 

We argue by contradiction and show that if (3.13) is violated for some k' , 
A' and n\, then for appropriate q, V(q) < V(q mm ), which is not possible. 
The proof of this does not use (3.3). 

For such A' and n\, n\ G A' since C™™' ■ ^ s concentrated on A' . We choose 
n,2 £ A' — A m [ n and define a new family of probability distributions q k a • by 

qk,A,n = Qk^An unless k = k , A = A' and either n = n\ or n = ri2, 



,iimi 



qk',A',m — Qk',A' 

Qk',A',n 2 — Qk',A',n 2 + e > 

where e > is small. 

For small enough e > 0, 



(3.14) 



k ACB N 



< EE C&»"P 



k ACB 



N 




where Cfc,A,n = f^k,A,nQk,A,n- The inequality is obvious when n\ 6 A mm , since 
the left side of (3.14) will be negative and the right side will be positive. 
When n\ £ A mm — A mm , the inequality is also true since the left side is 
bounded above by a multiple of e 2 (with the second term being 0), and the 
right side is bounded below by a multiple of e. For n 7^ n\, ny, 



(3.15) \J2 £ Ck,A,n-p) = EE 

k ACB N J \ k ACB N 

So, by (3.14) and (3.15), V(q) < V(q min ), which contradicts the definition 
of q min . This shows (3.13). 

Employing (3.13), we now show that A min = 0. One has 



N 

p - E E ^k,A,A™» = E E E 

A: ACA min k ACA min n= l 

(3.16) =E E E r A^nPk,A 

k ACA min n&A mln 

= E r ^ min ,« E E ^M-' 



mm 
A,n 



The inequality follows from (3.3), the second equality follows from (3.2) since 
tb,- is concentrated on B, and the third equality follows from the definition 
of '(™A,n and (3-13). But, on account of the definitions of A min and A min , 
the last quantity in (3.16) is at least 

P E r A™™,n = P, 

neA min 

with strict inequality holding if A mm ^ because all the terms in the sum are 
strictly positive. Since the strict inequality contradicts (3.16), this implies 
A mm = 0, and hence the inequality in (3.4), as desired. □ 

4. Definitions of norms and basic inequalities. This section introduces
the norms and provides the basic inequalities we will need in Sections
5 and 6 for the proofs of Theorems 1.1 and 1.3. The section consists of two
subsections. We first define the norms appearing in (1.8), $\|\cdot\|_L$, $\|\cdot\|_R$ and
$\|\cdot\|_A$, in terms of which $\|\cdot\|$ was defined. We then state and prove Propositions
4.1 and 4.2. These propositions give inequalities on the decrease of $\|\cdot\|$
and lie at the heart of the analysis in Sections 5 and 6. Proposition 4.2 is
the only place in the first six sections of the paper where the JSQ property
is employed.

Definition of the norms. In (1.8), we defined the norm $\|\cdot\|$ in terms of
the norms $\|\cdot\|_L$, $\|\cdot\|_R$ and $\|\cdot\|_A$. We now define these norms.

Recall from Section 1 that $z_n$ denotes the number of jobs at queue $n$; $w_n$
denotes the weighted workload at the queue, which is defined in (1.20) along
with $w_{n,i}$; $r_{n,i}$ denotes the service effort for job $(n,i)$; and $s_k$ denotes the
weighted interarrival time at the arrival stream $k$, and is given by (1.22).
The notation $Z_n(\cdot)$, $W_n(\cdot)$, $W_{n,i}(\cdot)$, $R_{n,i}(\cdot)$ and $S_k(\cdot)$ will be used for the
corresponding quantities of the process $X(\cdot)$. We also employ the arrival rates
$\alpha_k$, mean service times $m_j$, service rates $\mu_j$ and transition probabilities $p_{k,A}$
that were introduced earlier, as well as the transition probabilities $q_{k,A,n}$
that are given in Corollary 3.1 for class independent and station independent
JSQ networks. We will set $\epsilon_1 = 1 - \rho_1$ for class independent networks
and $\epsilon_1 = 1 - \rho_2$ for station independent networks, where $\rho_i$ are the traffic
intensities.

For $x \in S$, set

$$ \|x\|_L = \sum_{n=1}^{N} \|x\|_{L,n}, \qquad \|x\|_R = \sum_{n=1}^{N} \|x\|_{R,n}, \qquad \|x\|_A = \sum_{k=1}^{K} \|x\|_{A,k}. \tag{4.1} $$

(The subscripts L, R and A are mnemonics for "left", "right" and "arrivals".)
We define these individual components as follows. Since $\|x\|_{L,n}$ is defined in
terms of quantities obtained from $\|x\|_{R,n}$, we define the latter first.
For $n = 1,\ldots,N$, set

$$ \|x\|_{R,n} = \sum_{i=1}^{z_n} \mathring{m}_{k_{n,i},A_{n,i}}\, \psi_W(w_{n,i}), \tag{4.2} $$

where $w_{n,i} = \mu_{j_{n,i}} v_{n,i}$. The other components in (4.2) are defined as follows:

$$ \mathring{m}_{k,A} = \begin{cases} 1 & \text{for class independent networks,} \\ m_{k,A} & \text{for station independent networks,} \end{cases} \tag{4.3} $$

and $k_{n,i}$ and $A_{n,i}$ denote the arrival stream and selection set for the $i$th job
currently at queue $n$. The function $\psi_W : \mathbb{R}_{+,0} \to \mathbb{R}_{+,0}$ is required to be
continuously differentiable, with $\psi_W(0) = 0$, $\psi_W'(y) > 0$, $\psi_W'(y) \nearrow \infty$ as
$y \nearrow \infty$, and

$$ \int_0^\infty \psi_W(\mu_j y)\, F_j(dy) \le \epsilon_2 \quad \text{for all } j, \tag{4.4} $$

where $\epsilon_2 = (\epsilon_1)^2/40$. Since $F_j(\cdot)$ has finite mean, it is not difficult to choose
such a $\psi_W(\cdot)$.
The norm $\|\cdot\|_R$ will be the main contributor to $\|\cdot\|$ for jobs with large
residual service times; service of such a job $(n,i)$ sharply decreases $\|\cdot\|_R$
when $\psi_W'(w_{n,i})$ is large. The term $\mathring{m}_{k,A}$ is needed because of the different
definitions of the traffic intensities $\rho_1$ and $\rho_2$ for class independent and station
independent networks.

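For $n = 1,\ldots,N$, we set

$$ \|x\|_{L,n} = \Bigl( \sum_{i=1}^{z_n} \mathring{m}_{k_{n,i},A_{n,i}}\, (w^+_{n,i} \wedge L_2) \Bigr)\, \psi_Z(z_n). \tag{4.5} $$

Here,

$$ w^+_{n,i} = w_{n,i} + \epsilon_2 \tag{4.6} $$

and

$$ \psi_Z(y) = \begin{cases} \epsilon_1 + (\epsilon_3/L_2)\log(y+1) & \text{for } y \in [0, L_3], \\ \epsilon_1 + (\epsilon_3/L_2)\log(L_3+1) & \text{for } y > L_3, \end{cases} \tag{4.7} $$

for a small $\epsilon_3 > 0$, which will be defined in (4.20) in terms of $\epsilon_1$ and other
quantities.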
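
The shape of $\psi_Z$ in (4.7), and the increment bound on it that is used later in (4.23), can be checked numerically with a short sketch. The function name `psi_z` and the parameter values in the usage note are illustrative assumptions; they are not the constants of the paper.

```python
import math


def psi_z(y, eps1, eps3, L2, L3):
    """psi_Z from (4.7): logarithmic growth on [0, L3], constant above L3."""
    return eps1 + (eps3 / L2) * math.log(min(y, L3) + 1.0)


def max_increment_gap(eps1, eps3, L2, L3, z_max):
    """Largest value of [psi_Z(z+1) - psi_Z(z)] - eps3 / (L2 * (z+1)).

    A nonpositive result (up to rounding) confirms the first inequality
    of (4.23) for all integers z in [0, z_max).
    """
    return max(
        psi_z(z + 1, eps1, eps3, L2, L3) - psi_z(z, eps1, eps3, L2, L3)
        - eps3 / (L2 * (z + 1))
        for z in range(z_max)
    )
```

For example, `max_increment_gap(0.2, 0.01, 5.0, 50.5, 200)` should be nonpositive up to floating-point rounding, since $\log(z+2) - \log(z+1) \le 1/(z+1)$ and the increments vanish once $z \ge L_3$.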

We choose $L_2$ in (4.5) and (4.7) so that

$$ \psi_W'(L_2) = 2L_1 \tag{4.8} $$

for given $L_1$, with $L_1 \ge 4$ and $L_1$ large enough so that $L_2 \ge \epsilon_2$. We will
specify $L_1$ later as we find convenient. We choose $L_3$ so that

$$ \psi_Z(L_3) = \epsilon_1 + (\epsilon_3/L_2)\log(L_3+1) = L_1; \tag{4.9} $$

it follows that when $w \ge L_2$,

$$ \psi_Z(y) \le \tfrac{1}{2}\,\psi_W'(w) \quad \text{for all } y. \tag{4.10} $$

Note that $L_2 \to \infty$ and $L_3 \to \infty$ as $L_1 \to \infty$. The inequality (4.10) is used
in Proposition 4.1, and will tell us that, for large residual service times, the
norm $\|\cdot\|_R$ is "more powerful" than $\|\cdot\|_L$.

The following provides some motivation for the definition of $\|x\|_{L,n}$. The
norm $\|\cdot\|_L$ will be the main contributor to $\|\cdot\|$ for jobs with moderate
residual service times. The terms $w^+_{n,i} \wedge L_2$ and $\psi_Z(z_n)$ are each bounded,
with the term $w^+_{n,i} \wedge L_2$ decreasing continuously over time as the corresponding
job is served; we employ $w^+_{n,i}$ rather than $w_{n,i}$, in (4.5), to ensure $\|x\|_{L,n} \to \infty$
as $z_n \to \infty$. The inclusion of the term $\psi_Z(z_n)$, which is nondecreasing in $z_n$,
is reasonable since more jobs at a queue should correspond to a greater value
of $\|\cdot\|_L$.



For $k = 1,\ldots,K$, we set

$$ \|x\|_{A,k} = (1 + \epsilon_1/2) \sum_{A \subseteq B_N} \Bigl( \sum_{n=1}^{N} p_{k,A}\, q_{k,A,n}\, \mathring{m}_{k,A}\, \psi_Z(z_n) \Bigr)\, \psi_A(s_k). \tag{4.11} $$

Here, $q_{k,A,n}$ is chosen as in the corollary to Lemma 3.1 and $s_k = \alpha_k u_k$. We
require that $\psi_A(\cdot)$ be locally Lipschitz on $[0,\infty)$, with

$$ \psi_A(y) = M_1 - y \quad \text{for } y \in [0, M_1], \tag{4.12} $$

and $\psi_A'(y) > 0$ for $y \in (M_1,\infty)$ and appropriate $M_1 > 1$, with $\psi_A'(y) \nearrow \infty$
as $y \nearrow \infty$, so that

$$ \int_{M_1/\alpha_k}^{\infty} \bigl(\psi_A(\alpha_k y) + \alpha_k y\bigr)\, G_k(dy) \le \epsilon_2. \tag{4.13} $$

Since $G_k(\cdot)$ has finite mean, it is not difficult to choose such $M_1$ and $\psi_A(\cdot)$.
Because $G_k(\cdot)$ has mean $1/\alpha_k$ and (4.12) is satisfied, (4.13) implies that

$$ M_1 - \int_0^{\infty} \psi_A(\alpha_k y)\, G_k(dy) \ge 1 - \epsilon_2, \tag{4.14} $$

which will be used in the proof of Proposition 5.1.
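
For the reader's convenience, the step from (4.12) and (4.13) to (4.14) can be written out as follows. This is a sketch of the routine computation, using only that $G_k(\cdot)$ is a probability distribution with mean $1/\alpha_k$ and that $M_1 > 0$.

```latex
\begin{align*}
\int_0^\infty \psi_A(\alpha_k y)\, G_k(dy)
  &= \int_0^{M_1/\alpha_k} (M_1 - \alpha_k y)\, G_k(dy)
     + \int_{M_1/\alpha_k}^\infty \psi_A(\alpha_k y)\, G_k(dy) \\
  &\le M_1 - \int_0^\infty \alpha_k y\, G_k(dy)
     + \int_{M_1/\alpha_k}^\infty \bigl(\psi_A(\alpha_k y) + \alpha_k y\bigr)\, G_k(dy) \\
  &\le M_1 - 1 + \epsilon_2 .
\end{align*}
```

The first line uses (4.12) on $[0, M_1/\alpha_k]$, the second bounds $G_k([0, M_1/\alpha_k])$ by 1 and adds and subtracts the tail of $\int \alpha_k y\, G_k(dy)$, and the third uses the mean $1/\alpha_k$ together with (4.13). Rearranging gives (4.14).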

The norm $\|\cdot\|_A$ is chosen so that it interfaces properly with $\|\cdot\|_L$ and $\|\cdot\|_R$: in
between arrivals, $\|X(t)\|_A$ will increase more slowly than $\|X(t)\|_L + \|X(t)\|_R$
decreases; at an arrival, the average decrease of $\|X(t)\|_A$ will more than
offset the increase in $\|X(t)\|_L + \|X(t)\|_R$ because of (4.4) and (4.14), and
the choice of $\epsilon_3$ in (4.7).

Note that the weighted ages are not employed in the definition of || • ||. 
They are not needed, in particular, to show petiteness of bounded sets in 
|| • || under (1.11) and (1.12), since they do not appear when z n = 0. (See, 
e.g., the end of Section 2.) 

In Section 2, we demonstrated Theorem 1.2 by employing Theorem 1.1
and Lemma 4.1, with the latter asserting that $\|\cdot\|$ is continuous in the metric
$d(\cdot,\cdot)$. Having defined $\|\cdot\|$, we now state and prove the lemma.

Lemma 4.1. The norm $\|\cdot\|$ given by (1.8) is continuous in the metric
$d(\cdot,\cdot)$ given by (2.2).

Proof. It suffices to show each of the norms $\|\cdot\|_L$, $\|\cdot\|_R$ and $\|\cdot\|_A$ is
continuous in $d(\cdot,\cdot)$. The argument in each case is elementary. Noting that
$d(x,x') < 1$ implies $z_n = z'_n$ for all $n$, and that $w^+_{n,i} = w_{n,i} + \epsilon_2$ is continuous
in $w_{n,i}$, for given $i$, the continuity of $\|\cdot\|_L$ is not difficult to see. For the same
reasons and since $\psi_W(\cdot)$ and $\psi_A(\cdot)$ are locally Lipschitz, $\|\cdot\|_R$ and $\|\cdot\|_A$ are
also continuous. □

Basic inequalities. In this subsection, we demonstrate Propositions 4.1
and 4.2, which are at the foundation of the analysis in Sections 5 and 6. The
evolution of $X(t)$ between arrivals of jobs is deterministic. In Proposition
4.1, we provide upper bounds on the rate of change of $\|X(t)\|$ there by employing
its components $\|X(t)\|_L'$, $\|X(t)\|_R'$ and $\|X(t)\|_A'$, which exist almost
everywhere since the underlying functions are locally Lipschitz except where
jobs arrive or depart, with jumps being negative at departures. We set

$$ \mathring{\mu}_n = \begin{cases} \mu_n & \text{for class independent networks,} \\ 1 & \text{for station independent networks,} \end{cases} \tag{4.15} $$

and note that

$$ \mathring{\mu}_n = \mathring{m}_{k,A}\, \mu_j. \tag{4.16} $$

For Proposition 4.1, as elsewhere in the paper, we assume the network is
either class or station independent.

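Proposition 4.1. For every subcritical JSQ network,

$$ \|X(t)\|_L' + \|X(t)\|_R' \le \|X(t)\|_L' + \tfrac{1}{2}\|X(t)\|_R' \le \sum_n \mathring{\mu}_n \bigl(\epsilon_1 - \psi_Z(Z_n(t))\bigr) \tag{4.17} $$

for almost all $t$. Moreover, for any subset $\mathcal{K}$ of $\{1,\ldots,K\}$,

$$ \sum_{k \in \mathcal{K}} \|X(t)\|_{A,k}' \le (1 + \epsilon_1/2)\, \rho_i \sum_n \mathring{\mu}_n\, \psi_Z(Z_n(t)) \tag{4.18} $$

for almost all $t$, for $i = 1, 2$. Consequently, for almost all $t$,

$$ \|X(t)\|' \le (\epsilon_1/2) \sum_n \mathring{\mu}_n \bigl(2 - \psi_Z(Z_n(t))\bigr). \tag{4.19} $$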
The inequality (4.19) follows immediately from (4.17) and (4.18), with
$\mathcal{K} = \{1,\ldots,K\}$, together with (1.8), since $\epsilon_1 = 1 - \rho_i$. The purpose of
the term $\tfrac{1}{2}\|X(t)\|_R'$ in the middle quantity in (4.17) is to permit a sharper
upper bound, for large $\|X(t)\|_R$, by including the contribution from $\psi_W(\cdot)$
in (4.2). This strengthening of (4.19) will be applied in the proof of case (c)
of Proposition 5.1. Also, since $\psi_A'(y) \to \infty$ as $y \to \infty$, the inclusion in the
first sum in (4.18) of terms corresponding to certain $k \notin \mathcal{K}$ can improve the
bound on the right side, which will allow us to strengthen (4.19) for large
$\|X(t)\|_A$.
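
The arithmetic behind the step from (4.17) and (4.18) to (4.19) can be spelled out as follows. The sketch writes $\mathring{\mu}_n$ for the rates in (4.15) and $\psi_Z$ for $\psi_Z(Z_n(t))$, and it assumes, as the decomposition in (5.2) suggests, that the norm in (1.8) is the sum $\|\cdot\|_L + \|\cdot\|_R + \|\cdot\|_A$.

```latex
\begin{align*}
\|X(t)\|'
  &= \|X(t)\|_L' + \|X(t)\|_R' + \|X(t)\|_A' \\
  &\le \sum_n \mathring{\mu}_n \bigl(\epsilon_1 - \psi_Z\bigr)
     + (1 + \epsilon_1/2)(1 - \epsilon_1) \sum_n \mathring{\mu}_n\, \psi_Z \\
  &= \sum_n \mathring{\mu}_n \Bigl(\epsilon_1
     - \bigl(\tfrac{1}{2}\epsilon_1 + \tfrac{1}{2}\epsilon_1^2\bigr)\psi_Z\Bigr) \\
  &\le \frac{\epsilon_1}{2} \sum_n \mathring{\mu}_n \bigl(2 - \psi_Z\bigr).
\end{align*}
```

The second line uses (4.17), and (4.18) with $\mathcal{K} = \{1,\ldots,K\}$ and $\rho_i = 1 - \epsilon_1$; the third expands $(1+\epsilon_1/2)(1-\epsilon_1) = 1 - \epsilon_1/2 - \epsilon_1^2/2$; the last drops the term $\tfrac{1}{2}\epsilon_1^2\,\psi_Z \ge 0$.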

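Proof of Proposition 4.1. We need to demonstrate the first two displays.
For (4.17), we note that

$$ \begin{aligned}
\|X(t)\|_L' + \tfrac{1}{2}\|X(t)\|_R'
 &\le -\sum_n \mathring{\mu}_n\, \psi_Z(Z_n(t)) \sum_{i=1}^{Z_n(t)} 1\{W^+_{n,i}(t) \le L_2\}\, R_{n,i}(t) \\
 &\quad - \tfrac{1}{2} \sum_n \mathring{\mu}_n \sum_{i=1}^{Z_n(t)} \psi_W'(W_{n,i}(t))\, 1\{W^+_{n,i}(t) > L_2\}\, R_{n,i}(t) \\
 &\le -\sum_n \mathring{\mu}_n\, \psi_Z(Z_n(t)) \sum_{i=1}^{Z_n(t)} 1\{W^+_{n,i}(t) \le L_2\}\, R_{n,i}(t) \\
 &\quad - \sum_n \mathring{\mu}_n\, \psi_Z(Z_n(t)) \sum_{i=1}^{Z_n(t)} 1\{W^+_{n,i}(t) > L_2\}\, R_{n,i}(t) \\
 &= -\sum_n \mathring{\mu}_n\, \psi_Z(Z_n(t))\, 1\{Z_n(t) > 0\} \le \sum_n \mathring{\mu}_n \bigl(\epsilon_1 - \psi_Z(Z_n(t))\bigr)
\end{aligned} $$

holds almost everywhere. The first inequality follows from the definitions
of $\|\cdot\|_L$, $\|\cdot\|_R$ and (4.16), since $Z_n(\cdot)$ is constant almost everywhere. For
this, note that $W_{n,i}'(t) = -\mu_{j_{n,i}} R_{n,i}(t)$ almost everywhere. The second inequality
follows from (4.10), with the last inequality using $\psi_Z(0) = \epsilon_1$ and
$\sum_i R_{n,i}(t) = 1$, when $Z_n(t) > 0$. This implies (4.17).

For (4.18), we apply the definition of $\|\cdot\|_{A,k}$ in (4.11) to obtain, for almost
all $t$,

$$ \begin{aligned}
\sum_{k \in \mathcal{K}} \|X(t)\|_{A,k}'
 &\le (1 + \epsilon_1/2) \sum_{k \in \mathcal{K}} \sum_A \sum_n \alpha_k\, p_{k,A}\, q_{k,A,n}\, \mathring{m}_{k,A}\, \psi_Z(Z_n(t)) \\
 &\le (1 + \epsilon_1/2)\, \rho_i \sum_n \mathring{\mu}_n\, \psi_Z(Z_n(t)),
\end{aligned} $$

where the first inequality follows from $\psi_A'(y) \ge -1$, and the second inequality
follows from the corollary to Lemma 3.1. □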

We still need to specify the constant $\epsilon_3$ that was employed in (4.7); for
later reference, we also recall the constant $\epsilon_2$ from equations (4.4) and (4.6):

$$ \epsilon_2 = M_1\, m^{\mathrm{ratio}}\, \epsilon_3 = (\epsilon_1)^2/40. \tag{4.20} $$

Recall that $M_1$ is specified in (4.12)-(4.13) and $m^{\mathrm{ratio}}$ is defined as in the
equation before (1.16). In Proposition 4.2, we show that, for this choice of
$\epsilon_2$ and $\epsilon_3$, the expected increase in $\|\cdot\|$ is nonpositive at the time $T$ of the
first arrival of a job in the network. We note that for $X(0) = x$ fixed, $T$ is
deterministic, as is the evolution of $X(\cdot)$ up through $T-$.

Proposition 4.2. For every JSQ network,

$$ E_x[\|X(T)\|] \le \|X(T-)\| \quad \text{for all } x. \tag{4.21} $$

Proof. We consider the contribution to $\|\cdot\|$ from $\|\cdot\|_L$, $\|\cdot\|_R$ and $\|\cdot\|_A$,
assuming that a single arrival occurs from stream $k$ at time $T$; when arrivals
simultaneously occur from other streams, the corresponding bounds can be
applied sequentially.

For $\|\cdot\|_L$, one has

$$ \begin{aligned}
E_x[\|X(T)\|_L] - \|X(T-)\|_L
 &= \sum_A \sum_n p_{k,A}\, q^*_{k,A,n} \\
 &\quad \times \sum_{i=1}^{Z_n(T-)} \mathring{m}_{k_{n,i},A_{n,i}} \bigl(W^+_{n,i}(T) \wedge L_2\bigr) \bigl(\psi_Z(Z_n(T-)+1) - \psi_Z(Z_n(T-))\bigr) \\
 &\quad + \sum_A \sum_n p_{k,A}\, q^*_{k,A,n}\, \mathring{m}_{k,A}\, \psi_Z(Z_n(T-)+1) \int_0^\infty \bigl((\mu_j y + \epsilon_2) \wedge L_2\bigr)\, F_j(dy),
\end{aligned} \tag{4.22} $$

for each $x$, where, for given $k$ and $A$, $q^*_{k,A,n}$ is the probability that the arriving
job is assigned to queue $n$. As previously, $j = (k,A,n)$. Because of the JSQ
rule, $q^*_{k,A,n}$ is concentrated on the shortest queues in $A$. Since by (4.7),

$$ \psi_Z(Z_n(T-)+1) - \psi_Z(Z_n(T-)) \le \frac{\epsilon_3}{L_2\,(Z_n(T-)+1)} \le \epsilon_3/L_2, \tag{4.23} $$

the first term on the right side of (4.22) is at most

$$ \epsilon_3\, m^{\mathrm{ratio}} \sum_A p_{k,A}\, \mathring{m}_{k,A}. \tag{4.24} $$

On the other hand, since

$$ \int_0^\infty \bigl((\mu_j y + \epsilon_2) \wedge L_2\bigr)\, F_j(dy) \le \mu_j \int_0^\infty y\, F_j(dy) + \epsilon_2 = 1 + \epsilon_2, $$

which does not depend on $n$, and since $\psi_Z(y)$ is increasing in $y$ and $q_{k,A,n}$
is concentrated on $A$, the last term on the right side of (4.22) is at most

$$ (1 + \epsilon_2) \sum_A \sum_n p_{k,A}\, q_{k,A,n}\, \mathring{m}_{k,A}\, \psi_Z(Z_n(T-)+1). \tag{4.25} $$

In other words, removing the truncation by $L_2$ in the integral and replacing
$q^*_{k,A,n}$ (which is concentrated on the shortest queues in $A$) by $q_{k,A,n}$ can only
increase this term. Note that this is the only place in the first six sections of
the paper where the JSQ property is employed. It follows from the bounds
(4.24) and (4.25) that

$$ E_x[\|X(T)\|_L] - \|X(T-)\|_L \le (1 + \epsilon_2) \sum_A \sum_n p_{k,A}\, q_{k,A,n}\, \mathring{m}_{k,A} \bigl(\psi_Z(Z_n(T-)+1) + m^{\mathrm{ratio}} \epsilon_3\bigr). \tag{4.26} $$

For $\|\cdot\|_R$, it follows from (4.2) and (4.4) that

$$ \begin{aligned}
E_x[\|X(T)\|_R] - \|X(T-)\|_R
 &= \sum_A \sum_n p_{k,A}\, q^*_{k,A,n}\, \mathring{m}_{k,A} \int_0^\infty \psi_W(\mu_j y)\, F_j(dy) \\
 &\le \epsilon_2 \sum_A p_{k,A}\, \mathring{m}_{k,A}.
\end{aligned} \tag{4.27} $$

On the other hand, it follows from (4.11) that

$$ \begin{aligned}
E_x[\|X(T)\|_A] - \|X(T-)\|_A
 &\le (1 + \epsilon_1/2)(\epsilon_3/L_2)\, M_1 \sum_A p_{k,A}\, \mathring{m}_{k,A} \\
 &\quad - (1 - \epsilon_2)(1 + \epsilon_1/2) \sum_A \sum_n p_{k,A}\, q_{k,A,n}\, \mathring{m}_{k,A}\, \psi_Z(Z_n(T-)+1).
\end{aligned} \tag{4.28} $$

In the first term, the factors $\epsilon_3/L_2$ and $M_1$ are due to (4.23) and (4.12),
since $\psi_A(0) = M_1$, and in the second term, the factor $1 - \epsilon_2$ is due to (4.14).
Combining (4.26)-(4.28), it follows that

E X [\\X(T)\\] - \\X(T-)\\ < (Ul 1 m^ io e 3 + e 2 )Y / Pk,A m M 

A 

- [(1 - e 2 )(l + ei/2) - (1 + e 2 )] ^2p k ,Alk,A,n ™k,A ipz{Z n {T-) + 1). 

A n 






Since $\psi_Z(y) \ge \epsilon_1$ for all $y$, it follows from (4.20) that this is at most

$$ -\bigl((\epsilon_1)^2/8\bigr) \sum_A p_{k,A}\, \mathring{m}_{k,A} < 0. $$

So

$$ E_x[\|X(T)\|] \le \|X(T-)\|, $$

as desired. □

The following upper bound is a consequence of Propositions 4.1 and 4.2
and the strong Markov property. It will be applied in the proof of Theorem
5.1.

Corollary 4.1. For every subcritical JSQ network,

$$ E_x[\|X(t)\|] - \|x\| \le C_2\, t \quad \text{for all } t \text{ and } x, \tag{4.29} $$

where $C_2 = \sum_n \mathring{\mu}_n$.

Proof. Denoting by $T_1, T_2, \ldots$ the times at which arrivals occur and
applying the strong Markov property, one can repeatedly apply Propositions
4.1 and 4.2 over the intervals $(0, T_1 \wedge t]$, $(T_1 \wedge t, T_2 \wedge t]$, $\ldots$. Over each such
interval, it follows from (4.19) and (4.21) that

$$ E_x[\|X(T_{i+1} \wedge t)\|] - E_x[\|X(T_i \wedge t)\|] \le \sum_n \mathring{\mu}_n\, E_x\bigl[(T_{i+1} \wedge t) - (T_i \wedge t)\bigr] \tag{4.30} $$

for each $i$, since $\psi_Z(y) \ge 0$ for all $y$. Summing over $i$ gives

$$ E_x[\|X(t)\|] - \|x\| \le t \sum_n \mathring{\mu}_n, $$

and hence (4.29). □

5. Proof of Theorem 1.1. In this section, we demonstrate Theorem
1.1. The proof is organized as follows. We first show it suffices to demonstrate
Theorem 5.1, which is a slight variant of Theorem 1.1. The demonstration
of Theorem 5.1 is then reduced to showing Proposition 5.1. The inequality
in the proposition is expressed in terms of expected values of the norm $\|\cdot\|$
at stopping times $\sigma$ that will be introduced shortly. Most of the rest of the
section is devoted to showing Proposition 5.1, for which Propositions 4.1
and 4.2 are employed. At the end of the section, we briefly discuss a simpler
alternative approach that can be applied to certain service disciplines; we
also mention analogs of Theorem 1.1 where the state space is modified. We
then discuss the stability of the JLLQ rule.

In order to demonstrate Theorem 1.1, we need to verify the inequality
(1.10). In Theorem 5.1, we will instead demonstrate the variant (5.1). We
recall that

$$ \tau_M = \inf\{t : \|X(t)\| \le M\}. $$

Theorem 5.1. For each subcritical queueing network satisfying (1.7),
there exists $M$ so that

$$ E_x[\tau_M] \le C_3 \|x\| \quad \text{for all } x, \tag{5.1} $$

where $\|x\|$ is the norm given in (1.8) and $C_3 = \bigl(\epsilon_1 \sum_n \mathring{\mu}_n\bigr)^{-1}$.

The inequality (1.10) follows quickly from Theorem 5.1 and Corollary 4.1.
By (4.29),

$$ E_x[\|X(1)\|] \le \|x\| + C_2 \quad \text{for all } x, $$

where $C_2 = \sum_n \mathring{\mu}_n$. Restarting the process at time 1 and applying (5.1) to
$x' = X(1)$ implies that

$$ E_x[\tau_M(1)] \le C_3(\|x\| + C_2) + 1, $$

which implies (1.10) with $C_1 = C_3 \vee (C_2 C_3 + 1)$.

The proof of Corollary 4.1 did not require any conditions on the evolution
of $X(\cdot)$. In order to demonstrate Theorem 5.1, we need to consider the
behavior of $X(\cdot)$ when its norm is large. If $Z_n(\cdot)$ is uniformly large over
an interval for some $n$, we will be able to apply (4.19) of Proposition 4.1.
If either $\|X(\cdot)\|_R$ or $\|X(\cdot)\|_A$ is large, we will be able to employ versions
of (4.17) and (4.18). In each of these cases, we also apply Proposition 4.2.
Iteration of these bounds and application of the strong Markov property as
in the proof of the corollary will then imply the theorem.

In order to demonstrate (5.1), we need only consider $\|x\| > M$. On account
of (1.8), $\|x\| > M$ implies that, for given $M_L$, $M_R$ and $M_A$ with

$$ M = M_L + M_R + M_A, \tag{5.2} $$

either (a) $\|x\|_L > M_L$, (b) $\|x\|_L \le M_L$ and $\|x\|_A > M_A$, or (c) $\|x\|_L \le M_L$,
$\|x\|_A \le M_A$ and $\|x\|_R > M_R$. We will analyze these three cases for
appropriate $M_L$, $M_A$ and $M_R$.
Denote by $T$ the time of the first arrival in the network. We introduce the
stopping time $\sigma$, where

$$ \sigma = \inf\{t : \|X(t)\|_L \le M_L\} \wedge T \tag{5.3} $$

for $x$ satisfying (a) and

$$ \sigma = \inf\{t : \|X(t)\|_A \le M_A\} \wedge T \tag{5.4} $$

for $x$ satisfying (b). For $x$ satisfying (c), we set

(5.5) a = t x AT^ T+1 . 

Here, t x is deterministic and will be defined in (5.23); T is given in (1.7). (As 
mentioned there, T = for many applications.) The term T' x i is the time of 
the i th arrival at the queue n x , with n x being specified just before (5.23). 
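We also set

$$ \mathring{\mu}^{\mathrm{ratio}} = \begin{cases} \max_n \mathring{\mu}_n / \min_n \mathring{\mu}_n & \text{for class independent networks,} \\ 1 & \text{for station independent networks,} \end{cases} $$

and recall $L_1$, which was introduced in (4.8).
We will show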

Proposition 5.1. For each subcritical JSQ network satisfying (1.7) and
for $L_1$ satisfying $L_1 \ge 4N \mathring{\mu}^{\mathrm{ratio}}$, there exist $M_L$, $M_R$ and $M_A$, so that for $x$
satisfying either (a), (b) or (c), and $\sigma$ chosen as in (5.3)-(5.5),

$$ E_x[\sigma] \le C_3 \bigl(\|x\| - E_x[\|X(\sigma-)\|]\bigr), \tag{5.6} $$

where $C_3$ is as in Theorem 5.1.

Recall that $\|X(\cdot)\|$ has negative jumps at departures. It therefore follows
from Proposition 4.2 and the strong Markov property that

$$ E_x[\|X(\sigma)\|] \le E_x[\|X(\sigma-)\|] \quad \text{for all } x. \tag{5.7} $$

Together with (5.6), this implies

$$ E_x[\sigma] \le C_3 \bigl(\|x\| - E_x[\|X(\sigma)\|]\bigr) \tag{5.8} $$

for $x$ chosen as in the proposition.

It is not difficult to demonstrate Theorem 5.1 by iterating (5.8) and applying
the strong Markov property.
Proof of Theorem 5.1 using (5.8). Iteration of (5.8), by applying
the stopping rule $\sigma$ at each step, induces a sequence of stopping times

$$ 0 \le \sigma_1 \le \sigma_2 \le \cdots, $$

with the sequence stopping at $\sigma_I$ if $\|X(\sigma_I)\| \le M$. Repeated application of
(5.8), together with the strong Markov property, implies that for each $i \le I$,

$$ E_x[\sigma_i] \le C_3 \bigl(\|x\| - E_x[\|X(\sigma_i)\|]\bigr) \le C_3 \|x\| \tag{5.9} $$

for all $x$. (The sum of the bounds obtained from the right side of (5.8) forms
a telescoping series.) On the other hand, over every finite time interval,
there are only a finite number of arrivals and, in between arrivals, only the
norm $\|\cdot\|_A$ can increase; hence only a finite number of stopping times can
occur over a finite interval. It therefore follows from (5.9) and $\tau_M \le \sigma_I$ that
$\sigma_I < \infty$ almost surely, with

$$ E_x[\tau_M] \le E_x[\sigma_I] \le C_3 \|x\| \quad \text{for all } x. \tag{5.10} $$

This implies (5.1) of Theorem 5.1. □
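
The telescoping step behind (5.9) can be written out as follows. This is a sketch; the conventions $\sigma_0 = 0$ and $\sigma_i = \sigma_I$ for $i > I$ are introduced here only for bookkeeping and are not notation from the text.

```latex
\begin{align*}
E_x[\sigma_i - \sigma_{i-1}]
  &\le C_3 \bigl( E_x[\|X(\sigma_{i-1})\|] - E_x[\|X(\sigma_i)\|] \bigr),
  \qquad i = 1, 2, \ldots, \\
E_x[\sigma_i] = \sum_{j=1}^{i} E_x[\sigma_j - \sigma_{j-1}]
  &\le C_3 \bigl( \|x\| - E_x[\|X(\sigma_i)\|] \bigr) \le C_3 \|x\| .
\end{align*}
```

The first line is (5.8) applied to the process restarted at $\sigma_{i-1}$, by the strong Markov property; once the sequence has stopped, both sides are zero. Summing over $j$ cancels the intermediate terms, and the last inequality uses $\|\cdot\| \ge 0$.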

Most of the rest of this section is devoted to demonstrating Proposition
5.1. To do so, we consider separately the cases (a), (b) and (c) that are given
after (5.2). Cases (a) and (b) will be easy to show; case (c) requires more
effort. We employ the notation

$$ \mathring{m}^{\max} = \max_{k,A} \mathring{m}_{k,A}. \tag{5.11} $$

For later use, we also set $\mathring{m}^{\min} = \min_{k,A} \mathring{m}_{k,A}$ and $\mathring{\mu}^{\min} = \min_n \mathring{\mu}_n$.

Proof of Case (a) of Proposition 5.1. Under the condition (a), for
each $t \in [0, \sigma)$, there exists an $n(t)$ so that $\|X(t)\|_{L,n(t)} \ge M_L/N$, and hence
by the definition of $\|\cdot\|_{L,n}$ and by (4.9),

$$ Z_{n(t)}(t) \ge M_L / \bigl(\mathring{m}^{\max} L_2 N\, \psi_Z(Z_{n(t)}(t))\bigr) \ge M_L / \bigl(\mathring{m}^{\max} L_1 L_2 N\bigr). $$

Setting

$$ M_L = \mathring{m}^{\max} L_1 L_2 L_3 N, \tag{5.12} $$

it follows that $Z_{n(t)}(t) \ge L_3$, and hence $\psi_Z(Z_{n(t)}(t)) = L_1$. Since $L_1 \ge 4N \mathring{\mu}^{\mathrm{ratio}}$,
it follows from (4.19) of Proposition 4.1 that, almost everywhere on
$(0, \sigma)$,

$$ \|X(t)\|' \le -(\epsilon_1/2) \sum_n \mathring{\mu}_n \bigl(\psi_Z(Z_n(t)) - 2\bigr) \le -\epsilon_1 \sum_n \mathring{\mu}_n. $$

This implies case (a) of (5.6). □
We next demonstrate case (b). 

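Proof of Case (b) of Proposition 5.1. Under the condition (b), for
each $t \in [0, \sigma)$, there exists a $k(t)$ so that $\|X(t)\|_{A,k(t)} \ge M_A/K$, and hence
by the definition of $\|\cdot\|_{A,k}$ and by (4.9),

$$ \psi_A(S_{k(t)}(t)) \ge M_A / \bigl(2\, \mathring{m}^{\max} L_1 K\bigr). \tag{5.13} $$

Choose $y_1 \ge M_1$ large enough so that

$$ \psi_A'(y_1) \ge 2 \Bigl(\sum_n \mathring{\mu}_n\Bigr) \Big/ \min_{k,A} (\alpha_k\, \mathring{m}_{k,A}) \tag{5.14} $$

and $\psi_A'(y_2) \ge \psi_A'(y_1)$ for $y_2 \ge y_1$; this is possible since $\psi_A'(y) \nearrow \infty$ as
$y \nearrow \infty$. Setting

$$ M_A = 2\, \mathring{m}^{\max} L_1 K\, \psi_A(y_1), \tag{5.15} $$

it follows, from (5.13), that $\psi_A(S_{k(t)}(t)) \ge \psi_A(y_1)$ for each $t$, and hence that
$S_{k(t)}(t) \ge y_1$ and $\psi_A'(S_{k(t)}(t)) \ge \psi_A'(y_1)$. For $M_A$ as in (5.15), differentiation
of $\|X(\cdot)\|_{A,k}$ using (4.11) and the lower bound $\epsilon_1$ for $\psi_Z(y)$ therefore imply
that, at $k = k(t)$ with $t \in [0, \sigma)$,

$$ \|X(t)\|_{A,k}' \le -\epsilon_1 \sum_A \sum_n p_{k,A}\, q_{k,A,n}\, \alpha_k\, \mathring{m}_{k,A}\, \psi_A'(y_1), \tag{5.16} $$

which by (5.14) is at most $-2\epsilon_1 \sum_n \mathring{\mu}_n$.

We apply (4.18) of Proposition 4.1, with $\mathcal{K}$ equal to the complement of
$\{k(t)\}$. Adding (4.17) to (4.18), one obtains the analog of (4.19), but with
the additional term inherited from (5.16), namely, almost everywhere on
$[0, \sigma)$,

$$ \|X(t)\|' \le -(\epsilon_1/2) \sum_n \mathring{\mu}_n \bigl(2 + \psi_Z(Z_n(t))\bigr) \le -\epsilon_1 \sum_n \mathring{\mu}_n. \tag{5.17} $$

Case (b) of (5.6) follows. □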
We now begin the argument for case (c) of Proposition 5.1. This is the 
only part that requires the condition (1.7). It requires more work than the 
other two cases since, when $\|x\|_R$ is large, we need sufficient service of some 
job $(n, i)$, with large $w_{n,i}$, to ensure a rapid decrease of $\|\cdot\|$. Since any service 
discipline is allowed, such service need not occur at all, or even most, times. 
We will instead show that for $\|x\|_L \le M_L$, with $M_L$ not too large, and 
$\|x\|_R > M_R$, with $M_R$ enormous, there is a small (but, nevertheless, large 
enough) probability that a job $(n, i)$ with enormous $w_{n,i}$ receives sufficient 
service so that the corresponding derivative $\psi_W'(w_{n,i})$ induces an enormous 
decrease over $(0, \sigma)$ of $\|\cdot\|_R$, and hence of $\|\cdot\|$. In particular, sufficient service 
of such a job $(n, i)$ must occur if $(0, \sigma - m^{\max}]$ is sufficiently long to allow 
complete service of all jobs at $n$ with smaller weighted residual service times. 
(Recall that $m^{\max} = \max_j m_j$.) 

We will identify the job $(n, i)$, with "large" $w_{n,i}$, that was referred to in 
the last paragraph in terms of a rapidly increasing sequence $w(1), w(2), \ldots$. 
This sequence will also be used to define $t_x$, which was used in the definition 
of $\sigma$ in (5.5). We construct $w(i)$ and corresponding sequences $p(0), p(1), \ldots$ 
and $t(0), t(1), \ldots$ inductively. 

Set 

(5.18) $p(i) = h(t(i) + 1)$, 

where $h(\cdot)$ is given in (1.7). We choose $w(i)$ and $t(i)$ so that 

(5.19) $\psi_W'(w(i) - 1) = C_4 2^{i+\Gamma+2} (t(i-1) + 1) / p(i-1)$ 

and 

(5.20) $t(i) = \sum_{j=1}^{i} w(j) + 2\Gamma$. 

Here, $C_4 = \sum_n \mathring{\mu}_n / \mathring{\mu}^{\min}$, and $\psi_W(\cdot)$ and $\Gamma$ are as in (4.2) and (1.7), respectively. 
(Note that, on account of (4.4), the range of $\psi_W'(\cdot)$ contains $[1, \infty)$, and so 
(5.19) can always be solved for $w(i)$ for given $i$.) Often, $p(i)$ will decrease 
very rapidly to 0, and $w(i)$ and $t(i)$ will increase very rapidly. The factor $2^i$ 
in (5.19) is not required for the proof of Proposition 5.1, but will be used in 
the next section. Employing $w(i)$ and $M_L$, we set 



(5.21) $M_R = m^{\max} N \psi_W\bigl(\sum_{i=1}^{\lceil C_5 M_L \rceil} w(i)\bigr)$, 

where $C_5 = 1/(\epsilon_1 \epsilon_2 m^{\min})$. 
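
The recursion (5.18)-(5.20) can be sketched numerically to see how fast the three sequences move. The following is only an illustration under assumed inputs: the choices $h(s) = 1/(1+s)$, $\psi_W'(y) = 1 + y$ (so that its range contains $[1, \infty)$), $\Gamma = 1$ and $C_4 = 2$ are hypothetical and are not taken from the paper, and the function name `build_sequences` is ours. The sketch reads $t(0) = 2\Gamma$ from (5.20) with an empty sum.

```python
from fractions import Fraction

def build_sequences(n_terms, gamma, c4, h, dpsi_w_inverse):
    """Return (p, w, t) following (5.18)-(5.20); w[0] is a placeholder."""
    t = [Fraction(2 * gamma)]      # t(0) = 2*Gamma: the sum in (5.20) is empty
    p = [h(t[0] + 1)]              # (5.18): p(0) = h(t(0) + 1)
    w = [None]                     # w(0) is not defined in the text
    for i in range(1, n_terms + 1):
        # Right side of (5.19): C_4 * 2^(i+Gamma+2) * (t(i-1) + 1) / p(i-1)
        target = c4 * 2 ** (i + gamma + 2) * (t[i - 1] + 1) / p[i - 1]
        w.append(dpsi_w_inverse(target) + 1)   # solve psi_W'(w(i) - 1) = target
        t.append(sum(w[1:]) + 2 * gamma)       # (5.20)
        p.append(h(t[i] + 1))                  # (5.18)
    return p, w, t

def h(s):
    # Hypothetical lower bound h(.) from (1.7); any positive function would do.
    return Fraction(1) / (1 + s)

def dpsi_w_inverse(v):
    # Inverse of the hypothetical derivative psi_W'(y) = 1 + y.
    return v - 1

p, w, t = build_sequences(3, gamma=1, c4=2, h=h, dpsi_w_inverse=dpsi_w_inverse)
for i in range(1, 4):
    print(i, float(w[i]), float(t[i]), float(p[i]))
```

Even with this mild polynomial choice of $h(\cdot)$, $w(2)$ is already of order $10^7$, which matches the remark that $w(i)$ and $t(i)$ typically increase very rapidly while $p(i)$ decreases rapidly to 0.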

The sequence $w(i)$ was defined in (5.19) with Lemma 5.1 in mind. For the 
lemma, we relabel the residual service times $w_{n,i}$, $i = 1, \ldots, z_n$, in order of 
increasing value at each $n$, employing the notation $w'_{n,1}, w'_{n,2}, \ldots, w'_{n,z_n}$. We 
employ $k'_{n,i}$ and $A'_{n,i}$ for the corresponding renewal streams and selection 
sets. 

Lemma 5.1. Suppose that for given $n$, $\|x\|_{R,n} > M_R/N$ and $z_n \le C_5 M_L$. 
Then, for some $i = 1, \ldots, z_n$, $w'_{n,i} > w(i)$. 

Proof. If $w'_{n,i} \le w(i)$ for all $i = 1, \ldots, z_n$, then 

(5.22) $\|x\|_{R,n} = \sum_{i=1}^{z_n} m_{k'_{n,i},A'_{n,i}} \psi_W(w'_{n,i}) \le m^{\max} \sum_{i=1}^{\lceil C_5 M_L \rceil} \psi_W(w(i)) \le m^{\max} \psi_W\bigl(\sum_{i=1}^{\lceil C_5 M_L \rceil} w(i)\bigr) = M_R/N < \|x\|_{R,n}$, 

which is a contradiction. The first inequality holds since $z_n \le C_5 M_L$, the 
second inequality since $\psi_W(\cdot)$ is convex with $\psi_W(0) = 0$, and the last inequality 
since $\|x\|_{R,n} > M_R/N$. □ 

Suppose that for a given $x \in S$, $\|x\|_{R,n} > M_R/N$ and $z_n \le C_5 M_L$ for 
some queue $n$, which we denote by $n_x$. (In case of more than one such $n$, 
choose one of them.) As before Lemma 5.1, denote by $w'_{n_x,i}$, $i = 1, \ldots, z_{n_x}$, 
the ordered sequence obtained from $w_{n_x,i}$, $i = 1, \ldots, z_{n_x}$. We set 

(5.23) $t_x = m^{\max}(t(i_x - 1) + 1)$, 

where $i_x$ is the smallest index $i$ at which $w'_{n_x,i} > w(i)$. On account of Lemma 
5.1, such an index exists. The time $t_x$ is used to define $\sigma$ in (5.5) in case (c). 
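
The selection of $i_x$ and $t_x$ can be illustrated with a short sketch. The function names and the sample sequences below (`w_seq`, `t_seq`, with $w(i) = 10^i$ and $\Gamma = 1$) are hypothetical and chosen only to make the search concrete; the paper's $w(i)$ come from (5.19).

```python
def first_exceeding_index(residuals, w_seq):
    """Smallest 1-based index i with w'_i > w(i), where w' is the sorted list.

    Returns None when no such index exists, i.e. when the hypothesis
    of Lemma 5.1 fails for this queue.
    """
    ordered = sorted(residuals)            # relabel in order of increasing value
    for i, value in enumerate(ordered, start=1):
        if value > w_seq(i):
            return i
    return None

def t_x(i_x, m_max, t_seq):
    """The time in (5.23): t_x = m_max * (t(i_x - 1) + 1)."""
    return m_max * (t_seq(i_x - 1) + 1)

# Hypothetical sequences, for illustration only.
GAMMA = 1
w_seq = lambda i: 10 ** i
t_seq = lambda i: sum(w_seq(j) for j in range(1, i + 1)) + 2 * GAMMA  # (5.20)

i_x = first_exceeding_index([150, 1], w_seq)   # sorted: [1, 150]; 150 > w(2) = 100
print(i_x, t_x(i_x, m_max=1, t_seq=t_seq))
```

Jobs with ordered index below $i_x$ all have residual service times bounded by the corresponding $w(i)$, which is what makes the time budget in $t_x$ sufficient to clear them.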

Using the preceding construction, we now complete the proof of Proposi- 
tion 5.1. 

Proof of Case (c) of Proposition 5.1. Under case (c), $\|x\|_R > M_R$, 
and hence $\|x\|_{R,n} > M_R/N$ for some queue $n$. Moreover, since $\|x\|_L \le M_L$, 

(5.24) $z_n \le \|x\|_{L,n} / (\epsilon_1 \epsilon_2 m^{\min}) \le M_L / (\epsilon_1 \epsilon_2 m^{\min}) = C_5 M_L$, 

with (4.6) and (4.7) being used for the first inequality. So, the assumptions 
of Lemma 5.1 are satisfied, and $n_x$, $t_x$ and $i_x$ can be defined as immediately 
following the lemma. 

Let $B_x$ denote the event where at most $\Gamma$ potential arrivals occur at $n_x$ 
by time $t_x$ and their service times are each at most $2m^{\max}$. One can check 
that 

(5.25) $P_x(B_x) \ge 2^{-\Gamma} h(t_x / m^{\max}) = 2^{-\Gamma} p(i_x - 1)$, 

where $h(\cdot)$ is as in (1.7). For the inequality, note that the probability of a 
service time being at most $2m^{\max}$ is at least 1/2 (by Markov's inequality, since 
each mean service time is at most $m^{\max}$), and that the probability this 
occurs for all service times for up to $\Gamma$ jobs is at least $1/2^{\Gamma}$, after conditioning 
on knowing the arrival streams and selection sets of the jobs. The equality 
follows from (5.18) and (5.23). 

On $B_x$, $\sigma = t_x$. Moreover, by time $t_x$, the total service devoted to jobs at 
$n_x$, with $i \ge i_x$, is at least $m^{\max}$, since the total time required to serve all 
initial jobs $i$, with $i < i_x$, and the at most $\Gamma$ arriving jobs is at most 

(5.26) $\sum_{i=1}^{i_x - 1} m_{j'_{n_x,i}} w'_{n_x,i} + 2\Gamma m^{\max} \le m^{\max}\bigl(\sum_{i=1}^{i_x - 1} w(i) + 2\Gamma\bigr) = t_x - m^{\max}$, 

for $j'_{n,i}$ defined analogously to $k'_{n,i}$ and $A'_{n,i}$, where the equality follows from 
(5.20) and (5.23). Since $\psi_W'(\cdot)$ is nondecreasing, it follows that, on $B_x$, 

(5.27) $\|x\|_{R,n_x} - \|X(t_x-)\|_{R,n_x} \ge m^{\max} \mathring{\mu}^{\min} \psi_W'(w'_{n_x,i_x} - 1) \ge m^{\max} \mathring{\mu}^{\min} \psi_W'(w(i_x) - 1)$. 

Consequently, 

(5.28) $E_x[\|x\|_{R,n_x} - \|X(\sigma-)\|_{R,n_x}; B_x] = E_x[\|x\|_{R,n_x} - \|X(t_x-)\|_{R,n_x}; B_x]$ 
 $\ge m^{\max} \mathring{\mu}^{\min} 2^{-\Gamma} p(i_x - 1) \psi_W'(w(i_x) - 1)$ 
 $= C_4 \mathring{\mu}^{\min} 2^{i_x + 2} t_x \ge 4 C_4 \mathring{\mu}^{\min} t_x$, 

with the first inequality following from (5.25) and (5.27), and the last equality 
following from (5.19) and (5.23). 
On the other hand, 

(5.29) $E_x[\|x\| - \|X(\sigma-)\|] - \frac{1}{2} E_x[\|x\|_{R,n_x} - \|X(\sigma-)\|_{R,n_x}; B_x] \ge -\sum_n \mathring{\mu}_n E_x[\sigma]$. 

To see this, note that, on intervals between arrivals, $\|X(t)\|_{R,n}$ is decreasing 
for all $n$, and so 

(5.30) $\bigl(\|X(t)\| - \frac{1}{2}\|X(t)\|_{R,n_x} 1\{\omega \in B_x\}\bigr)' \le \bigl(\|X(t)\| - \frac{1}{2}\|X(t)\|_R\bigr)'$ 

almost everywhere. But at the time $T_i$ of an arrival in the network, 

(5.31) $\bigl(\|X(T_i)\| - \frac{1}{2}\|X(T_i)\|_{R,n_x} 1\{\omega \in B_x\}\bigr) - \bigl(\|X(T_i-)\| - \frac{1}{2}\|X(T_i-)\|_{R,n_x} 1\{\omega \in B_x\}\bigr) \le \|X(T_i)\| - \|X(T_i-)\|$ 

since $\|X(T_i)\|_{R,n_x} \ge \|X(T_i-)\|_{R,n_x}$. 
One obtains 

(5.32) $\bigl(\|X(t)\| - \frac{1}{2}\|X(t)\|_{R,n_x} 1\{\omega \in B_x\}\bigr)' \le \sum_n \mathring{\mu}_n$ 

by applying (4.17) and (4.18) of Proposition 4.1 to the right side of (5.30) to 
get the analog of (4.19) (with $\frac{1}{2}\|X(t)\|_R$, rather than $\|X(t)\|_R$, being applied 
in (4.17)). Also, by applying Proposition 4.2 to the conditional expectation 
with respect to $\mathcal{F}_{T_i-}$ of the right side of (5.31), one obtains 

(5.33) $E\bigl[\bigl(\|X(T_i)\| - \frac{1}{2}\|X(T_i)\|_{R,n_x} 1\{\omega \in B_x\}\bigr) - \bigl(\|X(T_i-)\| - \frac{1}{2}\|X(T_i-)\|_{R,n_x} 1\{\omega \in B_x\}\bigr) \,\big|\, \mathcal{F}_{T_i-}\bigr] \le 0$. 

One then obtains (5.29) from (5.32) and (5.33) and the strong Markov property 
by arguing as in the proof of Corollary 4.1. 
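It follows immediately from (5.28) and (5.29) that 

$\|x\| - E_x[\|X(\sigma-)\|] \ge 2 C_4 \mathring{\mu}^{\min} t_x - \sum_n \mathring{\mu}_n E_x[\sigma]$. 

Since $C_4 = (\sum_n \mathring{\mu}_n) / \mathring{\mu}^{\min}$ and $t_x \ge \sigma$, this implies 

$\|x\| - E_x[\|X(\sigma-)\|] \ge \sum_n \mathring{\mu}_n E_x[\sigma]$, 

which demonstrates case (c) of (5.6) of Proposition 5.1. 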



□ 






We note that, in some instances, the proof of case (c) of Proposition 
5.1 is not needed, or can be simplified. For instance, if all of the service 
distributions have bounded support then, on account of (5.24), case (c) will 
be vacuous if $M_R$ is chosen to be a large enough multiple of $M_L$, and so the 
proofs of the first two parts suffice. 

Suppose, instead, that the service discipline is PS. Then, at any time $t$ 
and queue $n$ with $Z_n(t) \le M_L$, $\|X(t)\|_{R,n}$ decreases at rate 

° z n (t) o 
If $M_L$ is fixed and $M_R$ is chosen large enough then, for $\|X(t)\|_{R,n} > M_R$, 
the right side of (5.34) is at least $4\sum_{n'} \mathring{\mu}_{n'}$ for some $n$. One can then argue 
analogously to the proof of case (b) that $\|X(t)\|$ decreases at least at rate 
$\sum_{n'} \mathring{\mu}_{n'}$ for $\|X(t)\|_R > M_R$, which will imply (5.6) in case (c) as well. One 
can employ a similar argument for the FIFO service discipline, although one 
first needs to redefine the state space S so as to suppress information from 
the state space descriptor on the service times of jobs for which service has 
not yet begun. In none of these instances is the assumption (1.7) used, since 
lower bounds on the service effort devoted to jobs with large residual service 
times always hold. (Such lower bounds will not in general hold for LIFO and 
certain other standard service disciplines.) 

When the interarrival times of a queueing network are exponentially 
distributed, one has the option of removing the residual interarrival times from 
the state space descriptor of the process $X(\cdot)$. The resulting process $X'(\cdot)$, 
which takes values $x'$ in the corresponding space $S'$, will also be Markov 
under service disciplines that do not employ the omitted information. It is 
easy to see that positive Harris recurrence of $X(\cdot)$ implies the same for $X'(\cdot)$; 
its equilibrium is the projection of the equilibrium of $X(\cdot)$. If one wishes, 
one can instead demonstrate the analog of Theorem 1.1 directly for $X'(\cdot)$ 
by employing the norm $\|\cdot\|'$, with 

(5.35) $\|x'\|' = \|x'\|_L + \|x'\|_R$, 

where $\|\cdot\|_L$ and $\|\cdot\|_R$ are defined as in (4.5) and (4.2). The proof simplifies 
somewhat, since one can combine Propositions 4.1 and 4.2, and one only 
requires the bounds (4.17), (4.26) and (4.27) given there. Since the state 
with no jobs is accessible and petite, positive Harris recurrence follows from 
this version of Theorem 1.1. One can also show the analog of Theorem 1.3, 
which is proved in the next section. 

As mentioned in Section 2, one can enrich the space S by introducing 
other coordinates in the state space descriptor, such as (a) the elapsed time 
since the last arrival from each stream, or (b) the amount of service already 
received by each job. The arguments we have given for Theorems 1.1 and 
1.2 remain the same in these settings, since the randomness of the networks 
- due to future interarrival and service times, and choices of the selection 
set - is not affected; nor is the norm || • ||. Theorem 1.3 can also be extended 
without difficulty to these settings, with the corresponding quantities being 
included in (1.24) and (1.25) if desired. It is easy to see from the summary 
of its proof that Corollary 1.1 holds under the enrichment (b); Corollary 1.1 
also holds under (a), although an extra step is required at the end of the 
proof to match up the new coordinates. Similar comments hold for Corollary 
1.2. In each of the above cases, when the interarrival times are exponentially 
distributed, one can simultaneously include new coordinates as in (b) while 
removing the coordinates corresponding to the residual interarrival time 
from the state space descriptor, as in the previous paragraph. 

JLLQ networks. At the beginning of Section 1, we briefly mentioned 
JLLQ networks, where jobs are assigned to the queue with the smallest 
workload, that is, to the queue $n$ where $\sum_{i=1}^{z_n} v_{n,i}$ is smallest. When two 
or more queues have the smallest workload, one of these queues is chosen 
according to some rule. The traffic intensities $\rho_i$, $i = 1, 2$, are defined as in 
(1.2) and (1.3) for the class and station independent cases, and the network 
is said to be subcritical if $\rho_i < 1$. 
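
The difference between the two assignment rules can be made concrete with a small sketch. The function names are ours, and breaking ties by the lowest queue index is an assumption made only for the example; the paper allows any tie-breaking rule.

```python
def assign_jsq(selection_set, queue_lengths):
    """JSQ: pick the queue in the selection set with the fewest jobs.

    Ties are broken by lowest queue index (an arbitrary illustrative rule).
    """
    return min(selection_set, key=lambda n: (queue_lengths[n], n))

def assign_jllq(selection_set, residual_times):
    """JLLQ: pick the queue in the selection set with the least total work.

    residual_times[n] lists the residual service times of the jobs at queue n.
    """
    return min(selection_set, key=lambda n: (sum(residual_times[n]), n))

# Queue 0 holds three short jobs; queue 1 holds one very long job.
residual_times = [[0.1, 0.1, 0.1], [5.0], [2.0], []]
queue_lengths = [len(jobs) for jobs in residual_times]

print(assign_jsq({0, 1, 2}, queue_lengths))    # fewest jobs
print(assign_jllq({0, 1, 2}, residual_times))  # least remaining work
```

In this example JSQ sends the job to queue 1, which has the fewest jobs but the largest workload, whereas JLLQ sends it to queue 0.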

The stability of the network is not affected by the (non-idling) service 
discipline, since the evolution of the workload at a queue is not affected. 
Analogs of Theorems 1.1 and 1.2, and Corollary 1.1 for the stability of 
subcritical JLLQ networks hold, as do the uniform bounds in Theorem 1.3 
and Corollary 1.2, with the assumptions (1.7) and (1.17) no longer being 
needed. 

Stability of subcritical JLLQ networks is intuitively more obvious than for 
subcritical JSQ networks, since the system cannot be "tricked" into sending 
jobs into queues with high remaining work. Stability is also easier to show. 
In [12], stability was shown by using the same fluid limit argument that 
was used there for JSQ networks with the FIFO service discipline. One can also 
show stability, as well as uniform bounds on the marginal distributions, with 
the aid of an appropriate norm that satisfies the analog of Theorem 1.1. We 
do not supply a proof here, but provide some motivation. 

Such a norm $\|\cdot\|$ is given by 

(5.36) $\|x\| = \sum_{n=1}^{N} \psi(g_n)$, 

with 

(5.37) $g_n = \sum_{i=1}^{z_n} m_{k_{n,i},A_{n,i}} w_{n,i} + (1 + \epsilon_1/2) \sum_{k=1}^{K} \sum_{A} p_{k,A} q_{k,A,n} m_{k,A} \psi_A(s_k)$. 

The quantities $w_{n,i}$, $\epsilon_1$, $p_{k,A}$, $q_{k,A,n}$, $s_k$ and $\psi_A(\cdot)$ are the same as those 
that were employed in Section 4 to define the norm $\|\cdot\|$ there. The function 
$\psi : \mathbb{R}_{+,0} \to \mathbb{R}_{+,0}$ is required to be twice continuously differentiable, with 
$\psi(0) = 0$, $\psi'(y) > 0$, $\psi'(y) \to \infty$ as $y \to \infty$, $\psi''(y) > 0$, $\psi''(y) \to 0$ as 
$y \to \infty$, and 

$\int_0^\infty \psi(y) F_j(dy) < \infty$ for all $j$. 

The first sum in (5.37) plays a role similar to $\|\cdot\|_{L,n}$ for the JSQ rule, and 
the double sum in (5.37) plays a role similar to $\|\cdot\|_A$. (When the interarrival 
times are exponentially distributed, one can remove them from the state 
space descriptor, and omit the double sum.) The function $\psi(\cdot)$ has been 
chosen so that for large $\|X(t)\|$, $\|X(t)\|' \le -C_6$ for given $C_6 > 0$. Since 
$\psi'(y) \to \infty$ as $y \to \infty$, the idling at empty queues does not affect this bound 
except in the computation of the constant. Also, since $\psi''(y) \to 0$ as $y \to \infty$, 
$\psi'(y)$ is "almost constant" locally for large $y$, which can be employed in 
conjunction with the subcriticality of the network to induce a negative drift 
for $\|X(t)\|$ at large values. We omit the details. 

6. Uniform bounds on families of JSQ networks and tightness. 

In Theorem 1.2, we demonstrated positive Harris recurrence for the Markov 
process $X(\cdot)$ for subcritical JSQ networks, provided the sets $A_M$ given there 
are petite. Such a network therefore has an equilibrium probability measure 
$\mathcal{E}$. Here, we consider the equilibria $\mathcal{E}^{(a)}$, $a \in \mathcal{A}$, of families $\mathcal{A}$ of such 
networks, and demonstrate the uniform bounds on the tails of these equilibria 
that are given in (1.24) and (1.25) of Theorem 1.3. The proof of (1.24) 
occupies most of this section. After proving the theorem, we then apply it to 
show tightness and relative compactness of the marginal distributions under 
additional assumptions on the service disciplines. 
We introduce the notation 

(6.1) $\gamma_Z^{(a)}(M) = \mathcal{E}^{(a)}(Z_1 > M)$, $\gamma_{\mathcal{L}}^{(a)}(M) = \mathcal{E}^{(a)}(\mathcal{L}_1^{(a)} > M)$, $\gamma_W^{(a)}(M) = \mathcal{E}^{(a)}(W_1^{(a)} > M)$ 

and 

(6.2) $\gamma_S^{(a)}(M) = \max_k \mathcal{E}^{(a)}(S_k^{(a)} > M)$. 

The quantities $z_n$, $\ell_n^{(a)}$, $w_n^{(a)}$ and $s_k^{(a)}$ were defined in (1.19)-(1.22); $Z_n$, $\mathcal{L}_n^{(a)}$, 
$W_n^{(a)}$ and $S_k^{(a)}$ are the corresponding random variables. (Recall that $\mathcal{L}_n$ is 
employed to avoid possible confusion with the constants $L_j$.) On account of 
the symmetry condition (1.23), the probabilities in (6.1) do not depend on 
the specific queue $n$ that is chosen; we denote by $z_1$, $\ell_1^{(a)}$ and $w_1^{(a)}$ the 
coordinates of a particular queue. To demonstrate (1.24) and (1.25) of Theorem 
1.3, it suffices to show that 

(6.3) $\gamma_Z^{(a)}(M)$, $\gamma_{\mathcal{L}}^{(a)}(M)$, $\gamma_W^{(a)}(M)$, $\gamma_S^{(a)}(M) \to 0$ as $M \to \infty$, 

uniformly in $a \in \mathcal{A}$. 

The argument for $\gamma_S^{(a)}(M)$ is elementary. The arrival flows are renewal 
processes with distributions $G_k^{(a)}(dy)$. The corresponding stationary 
distributions are therefore $\alpha_k y G_k^{(a)}(dy)$. Substituting this into (1.14) implies 
$\gamma_S^{(a)}(M) \to 0$ uniformly in $a$ as $M \to \infty$, as desired. 

The main idea in showing (6.3) for $\gamma_Z^{(a)}(M)$ and $\gamma_W^{(a)}(M)$ will be to employ 
the bounds from Sections 4 and 5, for large $\|\cdot\|_L^{(a)}$ and $\|\cdot\|_R^{(a)}$, to show that if 
at a given queue $n$ either the queue length $z_n$ or the weighted workload $w_n^{(a)}$ 
is large with nonnegligible probability with respect to a given initial measure 
$\nu$, then $E_\nu[\|X^{(a)}(t)\|^{(a)}]$ will decrease over an appropriate time interval. Since 
the measures $\mathcal{E}^{(a)}$ are stationary, this behavior will provide a contradiction 
unless (6.3) holds for both $\gamma_Z^{(a)}(\cdot)$ and $\gamma_W^{(a)}(\cdot)$. We first demonstrate (6.3) for 
$\gamma_Z^{(a)}(\cdot)$, which is not difficult. The argument for $\gamma_W^{(a)}(\cdot)$ is more involved, and 
relies on the argument for $\|\cdot\|_{R,n}$ in Section 5. 

The limit in (6.3) for $\gamma_{\mathcal{L}}^{(a)}(M)$ follows without difficulty from that for 
$\gamma_W^{(a)}(M)$. The basic idea is that jobs will not have time to age significantly 
before the queue empties, if the workload is typically low. We present this 
argument next. 

Proof of (6.3) for $\gamma_{\mathcal{L}}^{(a)}(\cdot)$. It suffices to show that, for each $\delta > 0$, 

(6.4) $\gamma_{\mathcal{L}}^{(a)}(M) > \delta \implies M \le M_1(\delta)$ 

for some function $M_1(\cdot)$ that does not depend on $a$. The argument involves 
partitioning the time interval $[0, g^{(a)} M]$, with $g^{(a)} = (m^{\max})^{(a)}$, into subintervals 
of length $g^{(a)} M_2(\delta)$, for appropriate $M_2(\delta)$. One applies (6.3), for $\gamma_W^{(a)}(\cdot)$, 
and (1.17) to obtain lower bounds on the probability that a given queue is 
empty sometime on such an interval. Appropriate events corresponding to 
the intervals will be disjoint, and adding their probabilities will then imply 
(6.4). 

We choose $M_2(\cdot)$ so that 

(6.5) $\sup_a \gamma_W^{(a)}(M_2(\delta) - 2\Gamma) \le \delta/2$, 

where $\Gamma$ is as in (1.17). Also, denote by $A_i^{(a)}(\delta)$, $i = 0, 1, 2, \ldots$, the events on 
which at most $\Gamma$ potential arrivals occur at the queue over $(i g^{(a)} M_2(\delta), (i+1) g^{(a)} M_2(\delta)]$, 
with each such arrival having service time at most $2 g^{(a)}$. As in 
(5.25), 

(6.6) $P_x(A_i^{(a)}(\delta)) \ge 2^{-\Gamma} h(M_2(\delta))$ 

for all $x$ and $a$. If we choose $M$ so that $\gamma_{\mathcal{L}}^{(a)}(M) > \delta$ for a given $\delta$ and $a$, it 
follows from (6.5), (6.6) and the stationarity of $\mathcal{E}^{(a)}$ that 

(6.7) $P_{\mathcal{E}^{(a)}}\bigl(\mathcal{L}_1^{(a)}(i g^{(a)} M_2(\delta)) > M,\ W_1^{(a)}(i g^{(a)} M_2(\delta)) \le M_2(\delta) - 2\Gamma;\ A_i^{(a)}(\delta)\bigr) \ge \delta 2^{-\Gamma-1} h(M_2(\delta))$. 

If at some time $r \in [i g^{(a)} M_2(\delta), g^{(a)} M]$, $Z_1^{(a)}(r) = 0$ holds then, for each 
$t \in [r, g^{(a)} M]$, it is immediate that $\mathcal{L}_1^{(a)}(t) \le M$. On the other hand, under the 
event $B_i^{(a)}(\delta)$ on the left side of (6.7), the total amount of work initially at or 
entering the queue over $[i g^{(a)} M_2(\delta), (i+1) g^{(a)} M_2(\delta)]$ is at most 
$g^{(a)}(M_2(\delta) - 2\Gamma + 2\Gamma) = g^{(a)} M_2(\delta)$, which implies it will be empty at some time in 
$[i g^{(a)} M_2(\delta), (i+1) g^{(a)} M_2(\delta)]$. Hence, on $B_i^{(a)}(\delta)$, 

(6.8) $\mathcal{L}_1^{(a)}(t) \le M$ for $t \in [(i+1) g^{(a)} M_2(\delta), g^{(a)} M]$. 

It follows from (6.8) and the definition of $B_i^{(a)}(\delta)$ that, for $\gamma_{\mathcal{L}}^{(a)}(M) > \delta$ 
and $M \ge I M_2(\delta)$, $I \in \mathbb{Z}_+$, the events $B_i^{(a)}(\delta)$, $i = 0, 1, \ldots, I-1$, are disjoint. 
Taking their union, it follows from (6.7) that 

$P_{\mathcal{E}^{(a)}}\bigl(\bigcup_{i=0}^{I-1} B_i^{(a)}(\delta)\bigr) \ge I \delta 2^{-\Gamma-1} h(M_2(\delta))$. 

Consequently, $I \le 2^{\Gamma+1} / (\delta h(M_2(\delta)))$, and so, under $\gamma_{\mathcal{L}}^{(a)}(M) > \delta$, 

$M \le \bigl(2^{\Gamma+1} / (\delta h(M_2(\delta))) + 1\bigr) M_2(\delta) = M_1(\delta)$, 

which does not depend on $a$. This implies (6.4). □ 
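
The bound $M_1(\delta)$ obtained at the end of the proof is an explicit formula, so it can be evaluated directly. In the sketch below the function name `m1_bound` is ours, and the inputs (a constant $h \equiv 1/2$, $\Gamma = 1$, $M_2(\delta) = 4$) are placeholders chosen only to make the arithmetic easy to follow; they are not values from the paper.

```python
def m1_bound(delta, gamma, m2, h):
    """Evaluate M_1(delta) = (2^(Gamma+1) / (delta * h(M_2)) + 1) * M_2.

    delta: tail probability threshold, gamma: the constant Gamma from (1.17),
    m2: the value M_2(delta), h: the lower-bound function h(.) from (1.17).
    """
    max_disjoint_events = 2 ** (gamma + 1) / (delta * h(m2))  # bound on I
    return (max_disjoint_events + 1) * m2

# Placeholder inputs, for illustration only.
print(m1_bound(delta=0.5, gamma=1, m2=4, h=lambda s: 0.5))
```

The first factor bounds the number $I$ of disjoint events, since their probabilities, each at least $\delta 2^{-\Gamma-1} h(M_2(\delta))$, cannot sum to more than 1.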

Demonstration of (6.3) for $\gamma_Z^{(a)}(\cdot)$. The uniformity conditions (1.13)-(1.16) 
ensure that the norms $\|\cdot\|^{(a)}$, $a \in \mathcal{A}$, can be defined, for a given 
$L_1$, by choosing the quantities $\psi_Z(\cdot)$, $\psi_W(\cdot)$, $\psi_A(\cdot)$, $L_2$, $L_3$, $M_1$, $\epsilon_1$, $\epsilon_2$ and 
$\epsilon_3$ from Section 4 so as not to depend on $a$. The uniform bounds obtained 
from these choices will be applied to show (6.3) for both $\gamma_Z^{(a)}(\cdot)$ and $\gamma_W^{(a)}(\cdot)$. 
The term $L_1$, $L_1 \ge 4$, should be thought of as a free variable; when showing 
(6.3), we will let $L_1 \to \infty$. 

To see the above claim on the choice of these quantities, first note that 
there exist functions $\psi_W(\cdot)$ and $\psi_A(\cdot)$ satisfying the regularity and 
monotonicity properties given after (4.3) and (4.11), with $\psi_A(\cdot)$ satisfying (4.12) 
for appropriate $M_1$, so that the following analogs of (4.4) and (4.13) hold: 



(6.9) $\sup_{a \in \mathcal{A}} \max_j \int_0^\infty \psi_W(\mu_j^{(a)} y) F_j^{(a)}(dy) \le \epsilon_2$ 

and 

(6.10) sup max f°° U A {p$V) + 4°M G { k a) (dy) < e 2 , 

where $\epsilon_2 = (\epsilon_1)^2/40$ as in (4.20); $\epsilon_1$ is determined by the left side of (1.15). 
Both inequalities follow without difficulty from (1.13) and (1.14); the uniform 
limits of the tails in (1.13) and (1.14) permit the choice of $\psi_W(\cdot)$ and 
$\psi_A(\cdot)$ as in (6.9) and (6.10), with $\psi_W'(y) \nearrow \infty$ and $\psi_A'(y) \nearrow \infty$ as $y \nearrow \infty$. 

As in (4.20), set $\epsilon_3 = (\epsilon_1)^2 / (40 M_1 m^{\text{ratio}})$, but with $m^{\text{ratio}}$ given by (1.16). 
For given $L_1$, $L_2$ and $L_3$ are chosen as in (4.8) and (4.9). The function $\psi_Z(\cdot)$ 
is then specified in (4.7), using these choices of $\epsilon_1$, $\epsilon_3$, $L_2$ and $L_3$. Employing 
these quantities, one defines $\|\cdot\|_R^{(a)}$, $\|\cdot\|_L^{(a)}$ and $\|\cdot\|_A^{(a)}$ as in (4.2), (4.5) and 
(4.11), and $\|\cdot\|^{(a)}$ as in (1.18). These norms will depend on $a$ in general. 

Rather than deal directly with $\|\cdot\|^{(a)}$, we need to employ a truncated 
version in order to guarantee that its expectation with respect to $\mathcal{E}^{(a)}$ is 
finite. We denote by $\|\cdot\|^{(a,L_1)}$ the norm on $S^{(a)}$ for given $L_1$ and, by 

(6.11) $\|x\|^{(a,\ell)} = \|x\|^{(a,L_1)} \wedge M$ for $x \in S^{(a)}$, 

its truncation at a given value $M$, with $\ell = (L_1, M)$. (If the expectation of 
$\|\cdot\|^{(a,L_1)}$ with respect to $\mathcal{E}^{(a)}$, $a \in \mathcal{A}$, is finite, one can set $M = \infty$.) Since 
$\mathcal{E}^{(a)}$, $a \in \mathcal{A}$, is invariant, 

(6.12) $E_{\mathcal{E}^{(a)}}[\|X^{(a)}(t)\|^{(a,\ell)}] = E_{\mathcal{E}^{(a)}}[\|X^{(a)}(0)\|^{(a,\ell)}]$ for all $t$. 

As in Section 5, we denote by $T$ the time of the first arrival in the network. 
Here and later on, when the context is clear, we drop the superscript $(a)$ for 
quantities such as $X(\cdot)$ and $Z_n(\cdot)$ that are associated with the networks. 
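
We decompose 

(6.13) $E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,\ell)} - \|X(t)\|^{(a,\ell)}\bigr]$ 

into two parts, depending on whether $T > t$ or $T \le t$. For small $t$, the part 
with $T > t$ will contribute the main term. For $\|x\|^{(a,L_1)} \le M$ and given $L_1$ 
and $M$, one has 

(6.14) $E_x\bigl[\|x\|^{(a,\ell)} - \|X(t)\|^{(a,\ell)}; T > t\bigr]$ 
 $\ge E_x\bigl[\|x\|^{(a,L_1)} - \|X(t)\|^{(a,L_1)}; T > t\bigr]$ 
 $\ge E_x\bigl[(\epsilon_1 \mathring{\mu}^{(a)}/2) \sum_n \int_0^t (\psi_Z(Z_n(t')) - 2)\,dt'; T > t\bigr]$ 
 $\ge E_x\bigl[(\epsilon_1 \mathring{\mu}^{(a)}/2) \sum_n \bigl((L_1/2) \int_0^t 1\{Z_n(t') \ge L_3\}\,dt' - 2 \int_0^t 1\{Z_n(t') < L_3\}\,dt'\bigr); T > t\bigr]$, 

with the first inequality holding since $\|X(t)\|^{(a,\ell)} \le \|X(t)\|^{(a,L_1)}$, the second 
inequality following from (4.19) of Proposition 4.1, and $\psi_Z(L_3) = L_1 \ge 4$ 
being used for the third inequality. Here, $\mathring{\mu}^{(a)} = \mathring{\mu}_n^{(a)}$, which is constant in 
$n$ by assumption. 

On the other hand, for fixed $a$ and $L_1$, 

$\mathcal{E}^{(a)}\bigl(\|X\|^{(a,L_1)} > M\bigr) \to 0$ as $M \to \infty$. 

Also, $\|X(0)\|^{(a,\ell)} \ge \|X(t)\|^{(a,\ell)}$ for $\|X(0)\|^{(a,L_1)} > M$, and the quantity 
inside $E_x[\cdot]$ for the last term in (6.14) is always at most $(\epsilon_1 \mathring{\mu}^{(a)}/4) L_1 N^{(a)} t$. 
It therefore follows from (6.14) that 

(6.15) $E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,\ell)} - \|X(t)\|^{(a,\ell)}; T > t\bigr]$ 
 $\ge E_{\mathcal{E}^{(a)}}\bigl[(\epsilon_1 \mathring{\mu}^{(a)}/2) \sum_n \bigl((L_1/2) \int_0^t 1\{Z_n(t') \ge L_3\}\,dt' - 2 \int_0^t 1\{Z_n(t') < L_3\}\,dt'\bigr); T > t\bigr] - L_1 N^{(a)} t \delta_1^{(a)}(M)$ 

for appropriate $\delta_1^{(a)}(\cdot)$, with $\delta_1^{(a)}(M) \searrow 0$ as $M \to \infty$. 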

We now set $M = L_3$, for $M$ in (6.3). Since $Z_n(\cdot)$ has right continuous 
sample paths and so cannot immediately fall below $L_3$ if $Z_n(0) > L_3$, the 
right side of (6.15) is at least 

$N^{(a)} t \bigl(\epsilon_1 \mathring{\mu}^{(a)} ((L_1/4) \gamma_Z^{(a)}(L_3) - 1) - L_1 \delta_1^{(a)}(M) - L_1 \delta_2^{(a)}(t)\bigr)$ 

for appropriate $\delta_2^{(a)}(\cdot)$, with $\delta_2^{(a)}(t) \searrow 0$ as $t \searrow 0$. Hence, for large enough 
$M$ and small enough $t$, 

(6.16) $E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,\ell)} - \|X(t)\|^{(a,\ell)}; T > t\bigr] \ge \epsilon_1 \mathring{\mu}^{(a)} N^{(a)} t \bigl((L_1/4) \gamma_Z^{(a)}(L_3) - 2\bigr)$. 

On the other hand, by employing (4.19) of Proposition 4.1, Proposition 
4.2 and the strong Markov property, it follows that 

(6.17) $E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,\ell)} - \|X(t)\|^{(a,\ell)}; T \le t\bigr] \ge -\epsilon_1 \mathring{\mu}^{(a)} N^{(a)} t P_{\mathcal{E}^{(a)}}(T \le t)$. 

One can see this by considering the above difference over $[0, T]$ and $[T, t]$, 
and, for the second part, arguing as in the proof of Corollary 4.1. 

Combining (6.16) and (6.17), and choosing $M$ large enough and $t$ small 
enough, one obtains 

(6.18) $E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,\ell)} - \|X(t)\|^{(a,\ell)}\bigr] \ge \epsilon_1 \mathring{\mu}^{(a)} N^{(a)} t \bigl((L_1/4) \gamma_Z^{(a)}(L_3) - 3\bigr)$. 

By the invariance of $\mathcal{E}^{(a)}$, the left side of (6.18) is zero, and so 

(6.19) $\gamma_Z^{(a)}(L_3) \le 12/L_1$. 

The limit (6.3) follows by letting $L_1 \to \infty$. □ 
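
The passage from (6.18) to (6.19) is a one-line computation, written out here for convenience. The symbols follow our reading of (6.18), which is an assumption: $\gamma_Z^{(a)}$ denotes the tail probability of the queue length and $\mathring{\mu}^{(a)}$ the common service-rate constant. The only facts used are the invariance of the equilibrium measure and the positivity of $\epsilon_1 \mathring{\mu}^{(a)} N^{(a)} t$.

```latex
\begin{align*}
0 &= E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,\ell)} - \|X(t)\|^{(a,\ell)}\bigr]
   \ \ge\ \epsilon_1 \mathring{\mu}^{(a)} N^{(a)} t
          \Bigl(\tfrac{L_1}{4}\,\gamma_Z^{(a)}(L_3) - 3\Bigr) \\
  &\Longrightarrow\quad \tfrac{L_1}{4}\,\gamma_Z^{(a)}(L_3) \le 3
   \quad\Longrightarrow\quad \gamma_Z^{(a)}(L_3) \le \frac{12}{L_1}.
\end{align*}
```

Since the right side does not involve $a$, the resulting bound is uniform over the family.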

Demonstration of (6.3) for $\gamma_W^{(a)}(\cdot)$. On account of the limiting behavior 
in (6.3) for $\gamma_Z^{(a)}(\cdot)$, it suffices to instead show that 

(6.20) $\gamma_{W'}^{(a)}(M) \to 0$ as $M \to \infty$ 

uniformly in $a$ for 

(6.21) $\gamma_{W'}^{(a)}(M) = \mathcal{E}^{(a)}(W_1^{(a)} > M', Z_1 \le M)$, 

where $M'$ is a function of $M$ with $M' \to \infty$ as $M \to \infty$. We employ the 
same basic framework here as we did in analyzing $\gamma_Z^{(a)}(\cdot)$, and will apply the 
truncated norm in (6.11) to (6.12), which we will show is violated unless the 
limit in (6.3) holds. As in the analysis of $\gamma_Z^{(a)}(\cdot)$, we set $M = L_3$ here; we 
will set $M' = L_4$, which will be a function of $L_3$ (and hence of $L_1$) and will 
be defined in (6.23). As in the demonstration of (6.3) for $\gamma_Z^{(a)}(\cdot)$, we continue 
to drop the superscript $(a)$ when convenient and the context is clear. 

The argument relies heavily on the constructions employed for the 
demonstration of case (c) of Proposition 5.1. We briefly recall the quantities that 
were employed and whether they depend on $a \in \mathcal{A}$. The quantities $\psi_Z(\cdot)$, 
$\psi_W(\cdot)$, $\psi_A(\cdot)$, $L_2$, $L_3$, $M_1$, $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$ were specified in the previous 
subsection and do not depend on $a$. As before, the quantity $L_1$ is allowed to vary, 
with $L_2$ and $L_3$ being functions of $L_1$. We recall the terms $\Gamma$ and $h(\cdot)$ from 
(1.17), and the terms $p(\cdot)$, $w(\cdot)$ and $t(\cdot)$ from (5.18)-(5.20), none of which 
depends on $a$. Instead of employing $C_4 = \sum_n \mathring{\mu}_n / \mathring{\mu}^{\min}$ as in (5.19), we now 
set 

(6.22) $C_4 = L_3$. 

Also, rather than employing the quantities $M_L$ and $M_R$ from (5.12) and 
(5.21), we use $L_3$ and 

(6.23) $L_4 = \psi_W\bigl(\sum_{i=1}^{\lceil L_3 \rceil} w(i)\bigr)$. 

Defining $w'_{n,i}$ as before Lemma 5.1, the conclusions of the lemma continue 
to hold if $w_n^{(a)} > L_4$ and $z_n \le L_3$ replace the conditions on $\|x\|_{R,n}$ and $z_n$ 
there, with the argument in the proof being the same. If $w_n^{(a)} > L_4$ and 
$z_n \le L_3$, for given $x$ and $n$, we define $t_{x,n}$ in terms of $i_{x,n}$ as in (5.23), but 
with $i_{x,n}$ being the smallest index $i$ at which $w'_{n,i} > w(i)$. 

Both $i_{x,n}$ and $t_{x,n}$ depend on $x$, and we wish to employ a fixed time that 
does not. For this, we note it follows from (6.21) that, for each $a \in \mathcal{A}$, there 
is a value $i^{(a)}$ at which 

(6.24) $\mathcal{E}^{(a)}\bigl(i_{X,1} = i^{(a)}, W_1^{(a)} > L_4, Z_1 \le L_3\bigr) \ge 2^{-i^{(a)}} \gamma_{W'}^{(a)}(L_3)$. 

After having chosen $i^{(a)}$, we set 

(6.25) $t^{(a)} = (m^{\max})^{(a)} \bigl(t(i^{(a)} - 1) + 1\bigr)$. 
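
The existence of the index in (6.24) is a pigeonhole argument: if every index $i$ carried mass less than $2^{-i}$ times the total, the masses would sum to less than the total. A minimal sketch of the search follows; the function name `pigeonhole_index` and the sample masses are hypothetical.

```python
def pigeonhole_index(masses):
    """Return the first 1-based index i with masses[i-1] >= 2^(-i) * total.

    masses[i-1] plays the role of the probability that the index equals i;
    such an index always exists when the list is nonempty, because the
    weights 2^(-i) sum to at most 1.
    """
    total = sum(masses)
    for i, mass in enumerate(masses, start=1):
        if mass >= total / 2 ** i:
            return i
    return None  # only reached for an empty list

# Hypothetical masses over the indices i = 1, 2, 3, 4.
print(pigeonhole_index([0.0, 0.05, 0.2, 0.0]))
```

This is the step where the factor $2^i$ built into (5.19) is used: it compensates for the factor $2^{-i}$ lost when fixing a single index.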

We also denote by $B_n^{(a)}$, $n = 1, \ldots, N$, the events where at most $\Gamma$ 
potential arrivals occur at queue $n$ by time $t^{(a)}$, their service times are each 
at most $2(m^{\max})^{(a)}$, $i_{X(0),n} = i^{(a)}$, $W_n^{(a)}(0) > L_4$ and $Z_n(0) \le L_3$. It follows 
from (5.25) and (6.24) that 

(6.26) $P_{\mathcal{E}^{(a)}}(B_n^{(a)}) \ge 2^{-i^{(a)} - \Gamma} p(i^{(a)} - 1) \gamma_{W'}^{(a)}(L_3)$ for all $n$. 

In the following proof, $i^{(a)}$, $t^{(a)}$ and $B_n^{(a)}$ will assume the roles played by $i_x$, 
$t_x$ and $B_x$ in the proof of case (c) of Proposition 5.1. 

Proof of (6.3) for $\gamma_W^{(a)}(\cdot)$. We first note that, for given $M$, 

(6.27) $E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,\ell)} - \|X(t^{(a)}-)\|^{(a,\ell)}\bigr] \ge E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,L_1)} - \|X(t^{(a)}-)\|^{(a,L_1)}; \|X(0)\|^{(a,L_1)} \le M\bigr]$ 

since $\|X(t^{(a)}-)\|^{(a,\ell)} \le \|X(t^{(a)}-)\|^{(a,L_1)}$. Setting 

(6.28) $A_1^{(a)} = \frac{1}{2} \sum_n E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|_{R,n}^{(a,L_1)} - \|X(t^{(a)}-)\|_{R,n}^{(a,L_1)}; B_n^{(a)}, \|X(0)\|^{(a,L_1)} \le M\bigr]$ 

and 

(6.29) $A_2^{(a)} = E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,L_1)} - \|X(t^{(a)}-)\|^{(a,L_1)}; \|X(0)\|^{(a,L_1)} \le M\bigr] - A_1^{(a)}$, 

we rewrite the right side of (6.27) as 

(6.30) $E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,L_1)} - \|X(t^{(a)}-)\|^{(a,L_1)}; \|X(0)\|^{(a,L_1)} \le M\bigr] = A_1^{(a)} + A_2^{(a)}$. 

We first analyze $A_1^{(a)}$, which will be the main term. The same reasoning 
as in (5.27) shows that $A_1^{(a)}$ is at least 

$\frac{1}{2} (m^{\max})^{(a)} \mathring{\mu}^{(a)} N^{(a)} P_{\mathcal{E}^{(a)}}\bigl(B_1^{(a)}, \|X(0)\|^{(a,L_1)} \le M\bigr) \psi_W'(w(i^{(a)}) - 1)$. 

This is at least 

(6.31) $\frac{1}{2} (m^{\max})^{(a)} \mathring{\mu}^{(a)} N^{(a)} \bigl(2^{-i^{(a)} - \Gamma} p(i^{(a)} - 1) \gamma_{W'}^{(a)}(L_3) - \delta_3^{(a)}(M)\bigr) \psi_W'(w(i^{(a)}) - 1)$ 
 $\ge C_4 \mathring{\mu}^{(a)} N^{(a)} \gamma_{W'}^{(a)}(L_3) t^{(a)}$ 

for $\gamma_{W'}^{(a)}(L_3) > 0$. The first line follows from (6.26), with $\delta_3^{(a)}(M) \searrow 0$ 
as $M \nearrow \infty$, and the following inequality follows (like the last equality in 
(5.28)) from (5.19), for large enough $M$. Combining the above inequalities, 
one obtains, for large enough $M$, 

(6.32) $A_1^{(a)} \ge C_4 \mathring{\mu}^{(a)} N^{(a)} \gamma_{W'}^{(a)}(L_3) t^{(a)}$. 

This holds trivially for $\gamma_{W'}^{(a)}(L_3) = 0$. 
We claim, on the other hand, that 

(6.33) $A_2^{(a)} \ge -\mathring{\mu}^{(a)} N^{(a)} t^{(a)}$. 

The argument for this is essentially the same as that given for (5.29) in the 
proof of case (c) of Proposition 5.1. On intervals between arrivals, $\|X(t)\|_{R,n}^{(a,L_1)}$ 
is decreasing for all $n$, and so, for all $\omega$, 

(6.34) $\bigl(\|X(t)\|^{(a,L_1)} - \frac{1}{2} \sum_n \|X(t)\|_{R,n}^{(a,L_1)} 1\{\omega \in B_n^{(a)}\}\bigr)' \le \bigl(\|X(t)\|^{(a,L_1)} - \frac{1}{2} \|X(t)\|_R^{(a,L_1)}\bigr)'$ 

holds almost everywhere. Also, at the time $T_i$ of an arrival in the network, 

(6.35) $\bigl(\|X(T_i)\|^{(a,L_1)} - \frac{1}{2} \sum_n \|X(T_i)\|_{R,n}^{(a,L_1)} 1\{\omega \in B_n^{(a)}\}\bigr) - \bigl(\|X(T_i-)\|^{(a,L_1)} - \frac{1}{2} \sum_n \|X(T_i-)\|_{R,n}^{(a,L_1)} 1\{\omega \in B_n^{(a)}\}\bigr)$ 
 $\le \|X(T_i)\|^{(a,L_1)} - \|X(T_i-)\|^{(a,L_1)}$ 

since $\|X(T_i)\|_{R,n}^{(a,L_1)} \ge \|X(T_i-)\|_{R,n}^{(a,L_1)}$. Arguing as after (5.32), one obtains 
the upper bound $\mathring{\mu}^{(a)} N^{(a)}$ for the right side of (6.34), and 0 for the expectation, 
over $\|X(0)\|^{(a,L_1)} \le M$, of the right side of (6.35). Application of these 
bounds, together with the strong Markov property, will then imply (6.33). 
Combining the bounds in (6.32) and (6.33), it follows from (6.27) that 

(6.36) $E_{\mathcal{E}^{(a)}}\bigl[\|X(0)\|^{(a,\ell)} - \|X(t^{(a)}-)\|^{(a,\ell)}\bigr] \ge C_4 \mathring{\mu}^{(a)} N^{(a)} \gamma_{W'}^{(a)}(L_3) t^{(a)} - \mathring{\mu}^{(a)} N^{(a)} t^{(a)}$ 

for large $M$. Since we have set $C_4 = L_3$, this equals 

$\bigl(L_3 \gamma_{W'}^{(a)}(L_3) - 1\bigr) \mathring{\mu}^{(a)} N^{(a)} t^{(a)}$. 

On the other hand, since $\mathcal{E}^{(a)}$ is invariant, the left side of (6.36) equals 0. 
Therefore, 

$\gamma_{W'}^{(a)}(L_3) \le 1/L_3$, 

and hence $\gamma_{W'}^{(a)}(L_3) \to 0$ uniformly in $a$ as $L_3 \to \infty$. This implies (6.20), and 
hence (6.3) for $\gamma_W^{(a)}(\cdot)$, as desired. □ 

At the end of Section 5, it was noted that the proof of Theorem 1.1 
simplifies for certain service disciplines, such as PS and FIFO. In particular, the 
proof of case (c) of Proposition 5.1 can be replaced by a simpler argument, 
as was outlined at the end of that section. The same is true for the 
demonstration of (6.3) for $\gamma_W^{(a)}(\cdot)$. For disciplines such as PS and FIFO, one can 
give a simpler argument along the lines of (6.3) for $\gamma_Z^{(a)}(\cdot)$, by investigating 
the decrease in the expected value of the corresponding norms over a small 
enough time interval $[0, t]$. The reasoning is similar to that provided at the 
end of Section 5. Analogs of the other observations at the end of Section 5 
also hold in the uniform setting of Section 6. 

Tightness and relative compactness of families of networks. We are 
interested here in the behavior of the projections of the equilibria measures 
$\mathcal{E}^{(a)}$ and $\bar{\mathcal{E}}^{(a)}$ onto $(S')^{(a)}$ and $(\bar{S}')^{(a)}$, $a \in \mathcal{A}$, for families $\mathcal{A}$ of JSQ 
networks, with $K^{(a)} = K$, for $(S')^{(a)}$ and $(\bar{S}')^{(a)}$ defined as in Section 2. The 
networks will be assumed to satisfy the hypotheses of Theorem 1.3. The 
measures $\bar{\mathcal{E}}^{(a)}$, $a \in \mathcal{A}$, are the natural extensions of $\mathcal{E}^{(a)}$, $a \in \mathcal{A}$, from $S^{(a)}$ 
to $\bar{S}^{(a)}$; they are concentrated on $S^{(a)}$. The spaces $(S')^{(a)}$ and $(\bar{S}')^{(a)}$ are 
assigned a fixed value $N'$, $N' \le N^{(a)}$, for all $a \in \mathcal{A}$ (corresponding to the 
first $N'$ queues); they are identical for different $a$, and hence can be denoted 
by $S'$ and $\bar{S}'$. Such $S'$ and $\bar{S}'$, which are projections of each $S^{(a)}$ and $\bar{S}^{(a)}$, 
will be referred to as common projections of the family $\mathcal{A}$. We denote the 
corresponding projected measures by $(\mathcal{E}^{(a)})'$ and $(\bar{\mathcal{E}}^{(a)})'$. 

As examples of such families of networks, one can think of JSQ networks 
indexed by $N$, the number of queues for the network, with the number of 
arrival streams $K$ being fixed. For $N \ge N'$, for given $N'$, the networks 
will share the common projection $S'$ obtained by retaining only the first 
$N'$ queues. As $N \to \infty$, one can investigate the limiting behavior of the 
projections $(\mathcal{E}^{(N)})'$ of the equilibria onto $S'$. 
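
A family indexed by the number of queues $N$ can be explored numerically. The sketch below is a rough illustration under simplifying assumptions that the paper does not require: Poisson arrivals with total rate `load * n_queues`, exponential service times with mean 1, and a selection set of `d` queues chosen uniformly at random for each job. The function name and parameters are hypothetical.

```python
import random

def simulate_jsq(n_queues, load, d, n_events, seed=0):
    """Simulate a Markovian JSQ(d) network; return the time-averaged number
    of jobs per queue.

    Assumes exponential interarrival and service times (rate-1 servers),
    which is a special case chosen only to keep the simulation short.
    """
    rng = random.Random(seed)
    queues = [0] * n_queues
    clock, area = 0.0, 0.0
    arrival_rate = load * n_queues
    for _ in range(n_events):
        busy = [n for n, z in enumerate(queues) if z > 0]
        total_rate = arrival_rate + len(busy)
        dt = rng.expovariate(total_rate)
        area += sum(queues) * dt          # integrate the total number of jobs
        clock += dt
        if rng.random() < arrival_rate / total_rate:
            subset = rng.sample(range(n_queues), d)           # selection set
            target = min(subset, key=lambda n: queues[n])     # shortest queue
            queues[target] += 1
        else:
            queues[rng.choice(busy)] -= 1                     # a departure
    return area / (clock * n_queues)

for n in (5, 10, 20):
    print(n, round(simulate_jsq(n, load=0.7, d=2, n_events=20000), 3))
```

For a subcritical load, the time-averaged queue length per queue stays bounded as the number of queues grows, which is the kind of uniform behavior that the tightness results formalize for the projected equilibria.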

To examine the common projections of $\mathcal{E}^{(a)}$ and $\bar{\mathcal{E}}^{(a)}$, for $a \in \mathcal{A}$, we recall 
from Section 2 the compact sets $E_M$ in $S$ and the compact sets $\bar{E}_M$ in $\bar{S}$. The 
corresponding sets for the projections $S'$ of $S$ and $\bar{S}'$ of $\bar{S}$ will be compact 
with respect to their respective metrics; we also denote these sets by $E_M$ 
and $\bar{E}_M$, respectively. 

Under the assumptions of Theorem 1.3, the uniform limits (1.24) and 
(1.25) hold. The projections $(\bar{\mathcal{E}}^{(a)})'$ onto $\bar{S}'$ of the measures $\bar{\mathcal{E}}^{(a)}$ are therefore 
tight with respect to the induced metrics. That is, for each $\epsilon > 0$, there is 
a compact set $B \subset \bar{S}'$ so that $(\bar{\mathcal{E}}^{(a)})'(B) \ge 1 - \epsilon$ for all $a \in \mathcal{A}$. The sets 
Em can be employed to see this: The conditions z n < M, < M and 

< < M in the definition of Em clearly suffice for (1.24). On the other 
hand, the probability of at least one arrival in the network over (0, 
from a given arrival stream k is at most 1/M for a network in equilibrium, 
and so the condition 1/M < < M suffices for (1.25). 

For certain service disciplines, the projections (£^)' onto S' of the mea- 
sures <p( a ) are also tight. In addition to employing the reasoning of the pre- 
vious paragraph, one also needs to show that 

(6.37) sup£ (a) (w$ < 1/M for some i, Z n > o) -> as M -> oo, 
adA v ' 1 

for given n. When the projections (£^)' are tight, it will be more informative 
to work with them than with the projections (S^)', since all limits will be 
concentrated on S'. 

Tightness of $(\mathcal{E}^{(a)})'$ is not difficult to show for PS networks since, when
the number of jobs at a queue is bounded, each job must be served at at least 
a given rate. When there are jobs with scaled residual service times close to 
0, these jobs will quickly leave the queue. On the other hand, the scaled rate 
at which jobs leave the queue is bounded in equilibrium, which therefore 
gives an upper bound on the expected number of jobs in equilibrium with 
scaled residual service times close to 0. 

For FIFO networks, (6.37) will not be true in general since, depending on
the choice of the network $a$, the distributions $F_j^{(a)}(\cdot)$ might be concentrated
arbitrarily close to 0, and jobs that are not the oldest at their queue will not 
be served until the departure of older jobs. For families A of networks where 
the service time distributions do not depend on a, (6.37) is not difficult to 
check, with the argument being similar to the argument for PS networks 
just mentioned. On the other hand, equation (6.37) will hold for general 
service distributions in the setting obtained by restricting the state space 
S by suppressing information on the service times of jobs for which service 
has not yet begun; this setting was mentioned at the end of Section 5. The 
argument in this setting proceeds as before. 

Families of measures [8 , a £ A, need not be tight for arbitrary service 
disciplines. This is the case for LIFO service disciplines, even when jVW = 
j(( a ) = \ pgr example, consider a family of networks A = {5, 6, . . . , N, . . .} 
having service distributions F^ N \-) with = 2 and point masses of size 
at least 1/3 at 1, and having interarrival distributions $G^{(N)}(\cdot)$ with
$\alpha^{(N)} = 1$ and support on (0, 2], and with point masses of size at
least 1/2 at $1 - 1/N$. With probability at least 1/2, an arriving job will
immediately begin receiving service that continues until its residual service
time is reduced to $1/N$, at which time its service is taken over by a new
arrival. Using this, one can
show that, for each N, the probability under the corresponding equilibrium 
measure $\mathcal{E}^{(N)}$ of there being at least one job with residual service time at
most 1/N is at least 1/25. This contradicts (6.37) and hence contradicts 
tightness. 

If on the other hand, for a family of networks with LIFO service disci- 
plines, the arrival streams are Poisson, it seems clear that (6.37) will hold, 
although the author does not see how to show this. 

A family of probability measures on a metric space is relatively compact 
if, for each sequence drawn from the family, there is a subsequence that con- 
verges to some probability measure on the space. It follows from Prohorov's 
Theorem that a tight family of measures is automatically relatively compact 
(see, e.g., [2]). Combining this with the preceding discussion of the projected
measures $(\mathcal{E}^{(a)})'$ and $(\bar{\mathcal{E}}^{(a)})'$ on common
projections $S'$, respectively $\bar{S}'$, for a family of JSQ networks, one
obtains the following conclusion from Theorem 1.3.

Theorem 6.1. Suppose that a family A of JSQ networks, with = 
K, satisfies the uniformity conditions (1.13)-(1.17) and (1.23), and that for 
each network a £ A, Am = {x : ||a;||^ < M} is petite for each M > with 
respect to the norms in (1.18). Then the projected measures
$(\bar{\mathcal{E}}^{(a)})'$ on each common projection $\bar{S}'$ of the
family $A$ are relatively compact. Moreover, if (6.37) also holds, then the
projected measures $(\mathcal{E}^{(a)})'$ on each common projection $S'$ are
relatively compact.

By restricting the family A of networks under consideration, one obtains 
the following analog of Corollary 1.2. 

Corollary 6.1. Suppose that each member of a family A of JSQ net- 
works has a single Poisson arrival stream, that Fj does not depend on j or 
a, that the selection rules are mean field and have uniformly bounded support, 
and that (1.26) holds. Then the projected measures
$(\bar{\mathcal{E}}^{(a)})'$ on each common projection $\bar{S}'$ of the
family $A$ are relatively compact. Moreover, if (6.37) also holds, then the
projected measures $(\mathcal{E}^{(a)})'$ on each common projection $S'$ are
relatively compact.

7. A family of JSQ networks with large workload. In Section 6,
we demonstrated Theorem 1.3. The limit (1.24) there states that in equi- 
librium, at each queue, the tails for the distribution on the number of jobs, 
their weighted ages and the weighted workload can be bounded uniformly 
for general families of networks. This bound does not depend on the ser- 
vice discipline. If one examines the proof of the theorem, one sees that the 
bounds that are obtained for the weighted workload are actually extremely 
weak. The term M' in the definition of 7™ (•) in (6.21) is given by M' = L,±, 
with L4 being defined in (6.23) in terms of the sequence w(l),w(2), . . . and 
L3. As noted after (5.21), w(i) will often increase very rapidly. In fact, one 
can check that, for Poisson arrival streams and service distributions having 
any given number of moments, w(i) can grow like 

(7.1) $w(i+1) \ge e^{b\,w(i)}$,

where $b > 0$ depends on the number of moments. The growth of M' in terms
of M, with M = L3, will therefore be far too rapid to infer anything useful 
quantitatively about the tail of the distribution of the weighted workload in 
equilibrium for a member of the family of networks. 

Rather than providing quantitative information, the purpose of the uni- 
form bounds in (1.24) of Theorem 1.3 was to establish tightness of the dis- 
tributions on the number of jobs, weighted ages and weighted workload at 
a queue. This was discussed in Sections 1 and 6. One can, however, ask 
whether the rapid growth exhibited in (7.1), and hence by L4, is an artifact 
of the proof or whether similar bad behavior is in fact possible for the work- 
loads of subcritical JSQ queueing networks. This point is of course relevant 
in deciding whether it is always advantageous to employ the JSQ rule for 
assigning arriving jobs, as opposed to, say, randomly choosing the queue, 
e.g., letting D = 1, in the setting of Section 1. 

In this section, we present a family of networks for which the weighted
workload exhibits bad behavior in an extreme manner that is of the order
of that suggested by (7.1). The structure of the networks in the family is
elementary in most aspects. Each network in the family possesses a single 
Poisson arrival stream and two queues to which jobs are directed; the state 
space S is therefore the same for each network. The selection set always 
consists of both queues, and the service distributions are both class and 
station independent. When the two queues have equal numbers of jobs, an
arriving job will choose each of them with probability 1/2. The rates of the
Poisson arrivals will depend on the network as will the service distributions, 
but the traffic intensity for the different networks will be uniformly bounded 
away from one. The service discipline will be the same for all networks, but 
the discipline is concocted so as to produce inefficient service. In particular,
service will be nonlocal in the sense that the choice of which jobs to serve 
will depend on the entire state of the network. 

As we will see, this discipline will cause certain jobs with large service 
times to be served very slowly, with service being concentrated on the jobs 
with shorter service times at that queue, which therefore quickly leave the 
queue. The other queue will, during such times, serve its jobs with longer 
service times first, which causes it to accumulate jobs with short service 
times. The presence of these unserved jobs induces arriving jobs to be di- 
rected to the first queue. Since these arriving jobs at the first queue will 
be served before a job with large service time is served, this slows down 
the rate at which such a job is served. Before such a job with large service 
time receives all of its service, with high probability, a job with substantially 
greater service time will enter the queue, which causes the workload at the 
queue to grow. Iterating this behavior produces the growth in workload in 
which we are interested. This behavior occurs with few jobs at the queue in 
relation to its workload. 

The remainder of the section is organized as follows. We first complete 
the description of the family of JSQ networks whose description was begun 
in the next to last paragraph. We then state Theorem 7.1, which describes 
the behavior of the weighted workload, in equilibrium, of these networks. 
The proof of the theorem consists of three parts and includes an induction 
argument on the size of a quantity related to the weighted workload. 

The family of networks is indexed by $\epsilon$, with $\epsilon > 0$. Rather
than directly give the service distributions and Poisson intensity of
arrivals, we specify them as follows. The service distribution
$F^{(\epsilon)}(\cdot)$ of jobs at each network is assumed to be discrete,
having point masses at $h(0), h(1), h(2), \ldots$, with
$h(0) = \delta \stackrel{\mathrm{def}}{=} \gamma_0 \epsilon$, for
$\gamma_0 \in (0, 1/200]$, $h(1) = 1$, $h(2) \in 2\mathbb{Z}_+$ with
$h(2) \ge c_1$, where $c_1 \ge 100$ will be specified in the proof of
Proposition 7.1 and does not depend on $\epsilon$ or $\gamma_0$,
$h(3) = (h(2))^3$, and

(7.2) $h(i+1) = e^{\sqrt{h(i)}}$ for $i = 3, 4, \ldots$;

we will restrict the family to indices $\epsilon$ with
$\epsilon \le 1/(h(3))^5$. (For convenience, we set $h(-1) = 0$.) We will
refer to jobs with (initial) service time $\delta$ as quick, those with
service time 1 as moderate, and those with service time $h(i)$, $i \ge 2$, as
large. Jobs with these service times are assumed to arrive in the network at
rate $2/\epsilon$ for $h(0)$, $2(1 - \eta)$ for $h(1)$, where
$\eta \in (0, 1/100]$, and at rate $2(h(i))^{-i}$ for $h(2), h(3), \ldots$.

One can check that the traffic intensity $\rho$ is given by

(7.3) $\rho = \gamma_0 + 1 - \eta + \sum_{i=2}^{\infty} (h(i))^{1-i}$.

We require that

(7.4) $\rho \le 1 - \eta/2$,

which is easy to do, because of the freedom in our choices for $\delta$ and
$h(2)$. Note that the arriving moderate jobs require most of the work. Because
of the growth of the exponents in the arrival rates for $h(i)$ as
$i \to \infty$, the distribution functions $F^{(\epsilon)}(\cdot)$ have all
moments. Note that as $\epsilon \searrow 0$, the mean of the service time goes
to 0. This, together with the fixed arrival rates for moderate and large jobs,
implies that the uniformity condition (1.13) in Theorem 1.3 is not satisfied.

We still need to specify the service discipline; as mentioned above, it 
depends on the entire state of the network. We first introduce some termi- 
nology. At each time, one of the queues will be the designated queue, with 
the other being the other queue. Provided the network is not empty, the 
designated queue will not be empty, in which case one of its jobs will be the 
designated job. We denote by $Y^{(\epsilon)}(t)$ the residual service time at time $t$ of
the designated job; when both queues are empty, set $Y^{(\epsilon)}(t) = 0$.

At t = 0, if the network is not empty, we choose one of the nonempty 
queues as the designated queue and a job with the largest residual service 
time for that queue as the designated job. This queue will remain the des- 
ignated queue until it is empty and the other queue is not. 

As time increases, a job becomes the designated job upon its arrival at the 
designated queue, if its service time is larger than or equal to the residual 
service time of the current designated job (and automatically becomes the 
designated job if the queue is empty). Only new jobs can become designated 
jobs, with the exception being when service of a designated job is completed 
(that is, the job leaves the network). It will follow from the service discipline 
given in the next paragraph that the queue must then be empty; the other 
queue (if not empty) then becomes the designated queue, with the job with 
the largest residual service time at the queue becoming the designated job. 
In order to indicate the designated queue and job, an additional coordinate 
needs to be added to the state space S (which we continue to denote by S). 

The allocation of service is specified as follows. At the designated queue, 
the designated job will only be served when there are no other jobs there. 
When a job arrives or departs from the network, among the nondesignated 
jobs, the job with the shortest residual service time at the designated queue
receives all of the service at the queue; this service continues until the next 
arrival or departure from the network. At the other queue, upon an arrival 
or departure, the job with the longest residual service time receives all of 
the service, with the job continuing to receive this service until the next 
arrival or departure. When the designated queue and other queue switch, 
the service rules also switch. One can check that the designated job has the 
largest residual service time among jobs at its queue, although not neces- 
sarily among jobs in the entire network. 
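
The allocation rule just described can be sketched in Python as follows. This
is only an illustration of the selection step: the function name, the list
representation of a queue and the argument names are assumptions, and the
sketch does not model the fact that the choice is made at arrival and
departure epochs and then held until the next such epoch.

```python
def job_in_service(residuals, designated=None):
    """Return the index of the job that receives all service at one queue.

    residuals  -- residual service times of the jobs at the queue
    designated -- index of the designated job if this is the designated
                  queue, or None if this is the other queue
    """
    if not residuals:
        return None  # empty queue: nothing to serve
    if designated is None:
        # Other queue: the job with the longest residual service time.
        return max(range(len(residuals)), key=residuals.__getitem__)
    others = [j for j in range(len(residuals)) if j != designated]
    if not others:
        return designated  # the designated job is served only when alone
    # Designated queue: shortest residual among the nondesignated jobs.
    return min(others, key=residuals.__getitem__)
```

With a large designated job and some quick and moderate jobs at the
designated queue, the quick job is selected, whereas at the other queue the
same collection of jobs would have its largest job served first.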

By (7.4), each member of the family of JSQ networks defined above is 
subcritical. Moreover, since the arrival stream is Poisson, the conditions 
(1.11) and (1.12) are satisfied. Consequently, by Corollary 1.1, the Markov 
process $X^{(\epsilon)}(\cdot)$ underlying each network is positive Harris
recurrent, and so has an equilibrium measure $\mathcal{E}^{(\epsilon)}$. The
following result gives a lower bound on the distribution of the sum
$W_1^{(\epsilon)} + W_2^{(\epsilon)}$ of the weighted workloads under
$\mathcal{E}^{(\epsilon)}$. (As shown in the proof, the same bounds hold for
the sum of the unweighted workloads.) Here, we set $\gamma_1 = 1/2000$.

Theorem 7.1. For the family of networks defined above, 

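(7.5) $\mathcal{E}^{(\epsilon)}\bigl(W_1^{(\epsilon)} + W_2^{(\epsilon)} \le h(\lfloor \gamma_1/\epsilon \rfloor)\bigr) \le 1/h(\lfloor \gamma_1/\epsilon \rfloor)$.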

On account of the recursion for $h(i)$ given by (7.2),
$h(\lfloor \gamma_1/\epsilon \rfloor)$ will be enormous when $1/\epsilon$ is a
moderate multiple of 2000, and in particular grows much more rapidly than
$1/\epsilon$; presumably, this rapid growth also holds for much
smaller values than 2000. The square root in (7.2) is chosen for convenience; 
it can be replaced by any power strictly less than 1, if h(3) is replaced by 
a correspondingly higher power of h(2). Note that the rate of growth given 
here is somewhat slower than that of w(i) in (7.1). 

It follows from (7.5) that the weighted workload under
$\mathcal{E}^{(\epsilon)}$ is typically at least of order
$h(\lfloor \gamma_1/\epsilon \rfloor)$. This contrasts with the mean weighted
workload for the equilibria of the networks obtained by setting D = 1, which
is of order $1/\epsilon$. (The mean workload for the equilibria of those
networks is bounded over all $\epsilon$.)

Theorem 7.1 holds because, when a designated queue has designated job
with residual service time $h(i)$, a job with service time $h(i+1)$ is more
likely to arrive at the queue before the residual service time of the
designated job decreases to $h(i-1)$, provided that $i \le \gamma_1/\epsilon$.
One can therefore compare $X^{(\epsilon)}(\cdot)$ with a discrete time
birth-death process on $0, 1, \ldots, \lfloor \gamma_1/\epsilon \rfloor + 1$,
with uniform positive drift. The comparison ceases to be valid when the
designated queue has too large a multiple of $1/\epsilon$ jobs.

Proof of Theorem 7.1. In order to make the preceding paragraph precise,
we will employ a recursion argument that requires two propositions. The 
first proposition asserts that the above behavior occurs for i = 2; the second 
proposition employs the first proposition and an induction argument to show 
this behavior for general i. Since the proof of the first proposition is fairly 
long and involves a substantial number of estimates, we give a condensed 
proof of it. 

For these propositions, we need to employ a modification of the process
$X^{(\epsilon)}(\cdot)$. We define a new family of Markov processes
$X^{(\epsilon,\kappa)}(\cdot)$, $\kappa = 0, 1, \ldots, K^{(\epsilon)}$, with
$K^{(\epsilon)} = \lfloor \gamma_1/\epsilon \rfloor$, where
$X^{(\epsilon,\kappa)}(\cdot)$ evolves in the same manner as
$X^{(\epsilon)}(\cdot)$, but with the following modification of the JSQ rule:
an arriving job at time $t$ selects the queue with the smaller value of
$Z_D^{(\epsilon,\kappa)}(t-) + \kappa$ and $Z_O^{(\epsilon,\kappa)}(t-)$, where
$Z_D^{(\epsilon,\kappa)}(\cdot)$ and $Z_O^{(\epsilon,\kappa)}(\cdot)$ denote
the number of jobs at the designated queue and other queue for the network
indexed by $(\epsilon, \kappa)$. The term $\kappa$ will serve as a "handicap"
for the designated queue. As before, we denote by
$Y^{(\epsilon,\kappa)}(\cdot)$ the residual service time of the designated
job. Note that, for $\kappa = 0$, the queue selection rule is JSQ and
$Y^{(\epsilon,0)}(\cdot) = Y^{(\epsilon)}(\cdot)$.
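
The modified selection rule can be sketched in Python as follows. The function
and argument names are assumptions made for illustration. The text specifies
that ties are broken with probability 1/2 under the JSQ rule; applying the same
tie-breaking when the handicap is positive is an assumption of this sketch.

```python
import random


def select_queue(z_designated, z_other, kappa=0, rng=random):
    """Return 'designated' or 'other' for a job arriving at the network.

    z_designated, z_other -- numbers of jobs at the two queues just before
                             the arrival
    kappa                 -- handicap added to the designated queue's count
    """
    handicapped = z_designated + kappa
    if handicapped < z_other:
        return "designated"
    if handicapped > z_other:
        return "other"
    # Tie: fair coin (assumed to carry over from the unhandicapped rule).
    return "designated" if rng.random() < 0.5 else "other"
```

A positive handicap can only divert arrivals away from the designated queue:
a state that sends the job to the designated queue when kappa = 0 may send it
to the other queue for larger kappa, but never the reverse.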

Proposition 7.1 gives the following uniform bound on
$Y^{(\epsilon,\kappa)}(\cdot)$ over all $\kappa \le K^{(\epsilon)}$. We denote
by $y$ the residual service time of the designated job for a state $x \in S$.

Proposition 7.1. Suppose that $x \in S$ is any state for which $y = h(2)$.
Let $T$ denote the stopping time at which either
$Y^{(\epsilon,\kappa)}(T) = h(1)$ or $Y^{(\epsilon,\kappa)}(T) = h(i)$, for
$i \ge 3$, first occurs. Then, for $\kappa \le K^{(\epsilon)}$,

(7.6) $P_x\bigl(Y^{(\epsilon,\kappa)}(T) = h(1)\bigr) \le 1/h(2)$.

The bound (7.6) depends strongly on our construction of
$X^{(\epsilon,\kappa)}(\cdot)$, where the designated queue tends to receive
more jobs than the other queue.

Proposition 7.2 generalizes Proposition 7.1 to $h(i)$,
$2 \le i \le K^{(\epsilon)} + 1$, by applying Proposition 7.1 together with an
induction argument. The proposition is stated for
$\kappa \le K^{(\epsilon)} - i + 1$, although only the case $\kappa = 0$ is
directly applied in the demonstration of Theorem 7.1. General $\kappa$ are
needed for the comparison in the first paragraph of the proof in the induction
argument.

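Proposition 7.2. Suppose that $x \in S$ is any state for which $y = h(i)$
for given $i$, with $2 \le i \le K^{(\epsilon)} + 1$. Let $T$ denote the
stopping time at which either $Y^{(\epsilon,\kappa)}(T) = h(i-1)$ or
$Y^{(\epsilon,\kappa)}(T) = h(\ell)$, for $\ell \ge i + 1$, first occurs.
Then, for $\kappa \le K^{(\epsilon)} - i + 1$,

(7.7) $P_x\bigl(Y^{(\epsilon,\kappa)}(T) = h(i-1)\bigr) \le 1/h(i)$.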

We first prove Theorem 7.1, assuming Proposition 7.2. As mentioned earlier,
the argument compares $Y^{(\epsilon)}(\cdot)$ to a birth-death process.

Proof of Theorem 7.1 assuming Proposition 7.2. Theorem 7.1 will
follow from Proposition 7.2 by using the reasoning outlined in the paragraph
before the beginning of this subsection. To see this, we first consider the
sequence of stopping times $T(0), T(1), T(2), \ldots$ that are defined
inductively as follows, when $Y^{(\epsilon)}(0) = h(i_0)$ for some
$i_0 \ge 1$. We set $T(0) = 0$ and, when $i_0 \ge 2$, denote by $T(1)$ the
first time at which either $Y^{(\epsilon)}(T(1)) = h(i_0 - 1)$ or
$Y^{(\epsilon)}(T(1)) = h(\ell)$, for $\ell \ge i_0 + 1$. When $i_0 = 1$,
$T(1)$ denotes the first time at which $Y^{(\epsilon)}(T(1)) = h(\ell)$, for
$\ell \ge 2$. The times $T(2), T(3), \ldots$ are defined analogously.

For $n = 0, 1, 2, \ldots$, we define the function $H_1^{(\epsilon)}(n)$ by
setting $H_1^{(\epsilon)}(n) = i \wedge (K^{(\epsilon)} + 1)$, where $i$ is the
value at which $Y^{(\epsilon)}(T(n)) = h(i)$, and denote by
$\mathcal{G}_n = \mathcal{F}_{T(n)}$ the $\sigma$-algebra at this time. It
follows from the strong Markov property and Proposition 7.2, with
$\kappa = 0$ and $2 \le i \le K^{(\epsilon)} + 1$ that, for
$H_1^{(\epsilon)}(n) = i$,

(7.8) $P\bigl(H_1^{(\epsilon)}(n+1) = i - 1 \mid \mathcal{G}_n\bigr) \le 1/4$

for all $\epsilon$; when $i = 1$, the corresponding probability is 0.

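One can compare $H_1^{(\epsilon)}(\cdot)$ with the birth-death process
$H_2^{(\epsilon)}(\cdot)$ on $i = 1, 2, \ldots, K^{(\epsilon)} + 1$, where

$P\bigl(H_2^{(\epsilon)}(n+1) = i - 1 \mid H_2^{(\epsilon)}(n) = i\bigr) = 1/4$,
$P\bigl(H_2^{(\epsilon)}(n+1) = i + 1 \mid H_2^{(\epsilon)}(n) = i\bigr) = 3/4$,

for $i \ne 1, K^{(\epsilon)} + 1$, with the values $i$ instead of $i - 1$ and
$i$ instead of $i + 1$ being taken at 1 and $K^{(\epsilon)} + 1$,
respectively. In particular, under
$H_1^{(\epsilon)}(0) = H_2^{(\epsilon)}(0)$, one can couple
$H_1^{(\epsilon)}(\cdot)$ and $H_2^{(\epsilon)}(\cdot)$ so that

(7.9) $H_1^{(\epsilon)}(n) \ge H_2^{(\epsilon)}(n)$ for all $n$.

Note that $H_2^{(\epsilon)}(\cdot)$ is Markov, but $H_1^{(\epsilon)}(\cdot)$ is not.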

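Let $\mathcal{E}_2^{(\epsilon)}$ denote the equilibrium measure on
$\{1, 2, \ldots, K^{(\epsilon)} + 1\}$ of the process
$H_2^{(\epsilon)}(\cdot)$. The process is reversible with respect to
$\mathcal{E}_2^{(\epsilon)}$, with

(7.10) $\mathcal{E}_2^{(\epsilon)}(B) = 1/3$, $\quad \mathcal{E}_2^{(\epsilon)}(\{K^{(\epsilon)} + 1\}) = 2/3$,

where $B = \{1, 2, \ldots, K^{(\epsilon)}\}$.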
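
The equilibrium measure of the comparison chain can be computed exactly. The
Python sketch below builds the transition matrix of the birth-death chain
described above and obtains its stationary distribution from detailed balance,
which gives weights proportional to 3^(i-1). The function names are
assumptions, and K stands for the upper index K^(epsilon). In this computation
the top state carries mass 2*3^K/(3^(K+1) - 1), which exceeds 2/3 by an amount
that is exponentially small in K, so the values 1/3 and 2/3 in (7.10) are
attained up to that correction.

```python
from fractions import Fraction


def transition_matrix(K):
    """Transition matrix on states 1, ..., K+1 (stored at indices 0, ..., K)."""
    n = K + 1
    P = [[Fraction(0)] * n for _ in range(n)]
    for s in range(n):
        P[s][max(s - 1, 0)] += Fraction(1, 4)      # step down, held at state 1
        P[s][min(s + 1, n - 1)] += Fraction(3, 4)  # step up, held at state K+1
    return P


def stationary_distribution(K):
    """Exact stationary law: detailed balance gives pi(i+1) = 3 * pi(i)."""
    weights = [Fraction(3) ** i for i in range(K + 1)]
    total = sum(weights)
    return [w / total for w in weights]


K = 5  # illustrative value only
pi = stationary_distribution(K)
mass_B = sum(pi[:-1])  # mass of B = {1, ..., K}
mass_top = pi[-1]      # mass of the state K + 1
```

Exact rational arithmetic is used so that stationarity can be verified without
rounding error.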
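
Also, let $L_{N,1}^{(\epsilon)}(A)$, $L_{N,2}^{(\epsilon)}(A)$ denote the
number of visits for $H_1^{(\epsilon)}(\cdot)$, $H_2^{(\epsilon)}(\cdot)$ to a
set $A$, over $1, \ldots, N$, starting from any initial state. By (7.9) and
(7.10),

(7.11) $\limsup_{N \to \infty} \frac{1}{N} L_{N,1}^{(\epsilon)}(B) \le \lim_{N \to \infty} \frac{1}{N} L_{N,2}^{(\epsilon)}(B) = 1/3$,

$\liminf_{N \to \infty} \frac{1}{N} L_{N,1}^{(\epsilon)}(\{K^{(\epsilon)} + 1\}) \ge \lim_{N \to \infty} \frac{1}{N} L_{N,2}^{(\epsilon)}(\{K^{(\epsilon)} + 1\}) = 2/3$

both hold almost surely.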

When $\mathcal{G}_n$ is given, with
$H_1^{(\epsilon)}(n) = i \le K^{(\epsilon)}$, then $T(n+1) - T(n)$ is
stochastically dominated by an exponentially distributed random variable
with mean $\frac{1}{2}(h(i+1))^{i+1}$. On the other hand, when
$\mathcal{G}_n$ is given, with $H_1^{(\epsilon)}(n) = K^{(\epsilon)} + 1$, then
$T(n+1) - T(n)$ stochastically dominates a random variable with mean
$\frac{1}{8}(h(K^{(\epsilon)} + 2))^{K^{(\epsilon)}+2}$. The upper bound is
due to the rate at which jobs of size at least $h(i+1)$ enter the network, as
given below (7.2). The lower bound employs this rate, together with (7.8), to
compare $T(n+1) - T(n)$, restricted to the outcome
$H_1^{(\epsilon)}(n+1) = K^{(\epsilon)} + 1$, to the restriction of an
exponentially distributed random variable.

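We let

$U_N^{(\epsilon)}(A) = \int_0^{T(N)} 1\{Y^{(\epsilon)}(t) \in A\}\,dt$

denote the occupation time for $Y^{(\epsilon)}(\cdot)$ of the set $A$ up to
the random time $T(N)$, starting from a given initial state. Recall that if
$H_1^{(\epsilon)}(n) = i$, then $Y^{(\epsilon)}(t) \ge h(i-1)$ for
$t \in [T(n), T(n+1))$. It follows from this, (7.11), the two bounds in the
previous paragraph, and the strong law applied to the incremental times
$T(n+1) - T(n)$ that

$\limsup_{N \to \infty} \frac{1}{N} U_N^{(\epsilon)}([0, h(K^{(\epsilon)})]) \le \frac{1}{3} \cdot \frac{1}{2} \cdot (h(K^{(\epsilon)} + 1))^{K^{(\epsilon)}+1}$,

$\liminf_{N \to \infty} \frac{1}{N} U_N^{(\epsilon)}((h(K^{(\epsilon)}), \infty)) \ge \frac{2}{3} \cdot \frac{1}{8} \cdot (h(K^{(\epsilon)} + 2))^{K^{(\epsilon)}+2}$

hold almost surely. Since $U_N^{(\epsilon)}([0, \infty)) = T(N)$, it follows
that

(7.12) $\limsup_{N \to \infty} \dfrac{U_N^{(\epsilon)}([0, h(K^{(\epsilon)})])}{T(N)} \le \dfrac{2(h(K^{(\epsilon)} + 1))^{K^{(\epsilon)}+1}}{(h(K^{(\epsilon)} + 2))^{K^{(\epsilon)}+2}} \le 1/h(K^{(\epsilon)} + 2) \le 1/h(K^{(\epsilon)})$.

Since the process $X^{(\epsilon)}(\cdot)$ is ergodic, it follows from (7.12)
that

(7.13) $\mathcal{E}^{(\epsilon)}\bigl(Y^{(\epsilon)} \le h(K^{(\epsilon)})\bigr) \le 1/h(K^{(\epsilon)})$.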

The residual service time of the designated job is bounded above by
the (unweighted) workload in the network, which is at most
$2\epsilon(W_1 + W_2)$ since the mean service time is less than $2\epsilon$.
Theorem 7.1 therefore follows from (7.13) and the definition of
$K^{(\epsilon)}$, after dropping the $2\epsilon$ coefficient. □

We now prove Proposition 7.1. The proof for the proposition is fairly 
long and requires a number of steps, where one obtains various bounds on 
the number of jobs entering each of the two queues. For these bounds, we 
will repeatedly employ elementary large deviation bounds involving sums of 
independent random variables. We assume the reader is familiar with these 
types of estimates and sketch the corresponding steps. The first part of the 
proof, through (7.21), is given in detail. 

Condensed proof of Proposition 7.1. Our basic reasoning will be 
as follows: When the other queue is not empty, there will typically be either 
moderate or large jobs there. On account of the service discipline at the 
queue, the moderate and large jobs will be served before the quick jobs, 
which allows quick jobs to accumulate. This contrasts with service at the 
designated queue, where quick jobs are served first and so do not accumulate. 
Therefore, because of the JSQ rule, when there are jobs in the other queue, 
most arriving jobs will select the designated queue. Since there will be jobs 
in the other queue most of the time, this implies that the amount of work 
in the designated queue will typically have a positive drift. The probability 
will therefore be small that $Y^{(\epsilon,\kappa)}(t) \le \frac{1}{2}h(2)$ before time
$t_0 \stackrel{\mathrm{def}}{=} (h(3))^4 = (h(2))^{12}$. But by time $t_0$, there will, with high probability, be many arriving
jobs in the network with service time at least h(3), and hence at least one 
such large job will arrive at the designated queue. So (7.6) of the proposition 
will follow. 

We first introduce some notation consisting of the events
$A_1(n), A_2(n), \ldots, A_8(n)$, with $n = 1, \ldots, t_0$, and the stopping
time $T_0$. We denote by $A_1(n)$ the event on which the amount of work
present at the designated queue at time $n$ is at least $\frac{1}{2}h(2)$ and
at all previous times is at least $\frac{1}{2}h(2) - 1$. Note that the
designated queue remains the same over $[0, n]$ on $A_1(n)$. It is easy to
see that, under $A_1(n)$,
$Y^{(\epsilon,\kappa)}(t) \ge \frac{1}{2}h(2) - 1$ for $t \le n + 1$, that

(7.14) $A_1(n)^c = \emptyset$ for $n \le \frac{1}{2}h(2)$,

and that $Y^{(\epsilon,\kappa)}(T) = h(1)$, with $T \le t_0$, can only occur
when $A_1(t_0)^c$ occurs, where $T$ is given in the statement of the
proposition.

All of the other events $A_2(n), \ldots, A_8(n)$ are assumed to be subsets of
$A_1(n)$ that, in addition, satisfy the following properties. The event
$A_2(n)$ occurs if the designated queue at no time in $[0, n]$ has more than
$1/(1000\epsilon)$ jobs; $A_3(n)$ is the event for which the other queue at no
time in $[0, n]$ has more than $1/(500\epsilon)$ jobs that were not present at
time 0. The event $A_4(n)$ occurs if the subset of $[0, n]$ on which the other
queue is empty has Lebesgue measure at most $\frac{5}{13}n$. The event
$A_5(n)$ occurs if a total of at most $n/(200\epsilon)$ jobs ever arrive at
the other queue at times in $(0, n]$ at which there is a job in the queue with
residual service time strictly greater than $\delta$; $A_6(n)$ is the event on
which at most $n/100$ of these arriving jobs are moderate or large. Let $T_0$
be the first time at which the other queue has at most $1/(500\epsilon)$ jobs.
The event $A_7(n)$ occurs if the subset of $[T_0, n]$ on which the other queue
is not empty and has only jobs with residual service times at most $\delta$
has Lebesgue measure at most $\frac{1}{50}n$. The event $A_8(n)$ occurs if at
least $n$ moderate or large jobs arrive at the designated queue over $(0, n]$.
Except for (7.15), where $A_8(n)$ is employed, the precise definitions of the
events $A_2(n), \ldots, A_8(n)$ will not be needed until the paragraph after
(7.21). One can check that

(7.15) $A_1(n)^c \subset A_8(n-1)^c$ for $n > \frac{1}{2}h(2)$.

For this, note that $A_1(n-1)^c \subset A_8(n-1)^c$ and that if
$\omega \in A_1(n-1) \cap A_1(n)^c$, then the amount of work present in the
designated queue at time $n$ is less than $\frac{1}{2}h(2)$, which implies
$\omega \in A_8(n-1)^c$.

In order to show (7.6), we may assume that, at time 0, the designated 
queue has only a single job, consisting of a designated job with residual 
service time h(2). Otherwise, one can wait until all jobs, except for the 
designated job, leave the queue, or a job with service time at least h(3) 
arrives at the designated queue. Since the designated job is served last, one 
or the other event must eventually happen. If the former occurs, reset the 
time to 0. 

Let $B(n)$ be the subset of $A_1(n)$ on which no job with service time at
least $h(3)$ arrives at the designated queue by time $n$. Most of the work in
showing (7.6) consists of showing the following bounds involving $B(t_0)$ and
$A_\ell(n)$:

(7.16) $P_x(B(t_0)) \le 2e^{-c_2 h(2)}$

and

(7.17) $P_x\bigl(A_1(n) \cap A_\ell(n)^c\bigr) \le e^{-c_2 h(2)}$ for $\ell = 2, \ldots, 8$,

for integers $n \in [\frac{1}{2}h(2), t_0]$ and appropriate $c_2 > 0$ not
depending on the choice of $c_1$, which is given before (7.2). The
inequalities (7.16) and (7.17), for $\ell = 8$, will be applied in the next
paragraph. The other inequalities in (7.17) are needed to show this
inequality. We assume that $c_1$ is chosen large enough so that
$(c_1')^{14} \le e^{c_1' c_2}$ for $c_1' \ge c_1$.

We now show (7.6) and afterwards motivate the inequalities. We may
assume, by induction on $n$,

(7.18) $P_x\bigl(A_1(n-1)^c\bigr) \le (n-1)e^{-c_2 h(2)}$;

(7.18) trivially holds for $n - 1 \le \frac{1}{2}h(2)$ on account of (7.14).
Because of (7.15) and (7.17), with $\ell = 8$, it follows from (7.18) that

(7.19) $P_x\bigl(A_1(n)^c\bigr) \le n e^{-c_2 h(2)}$.

Consequently, (7.19) holds for all $n \le t_0$ and, in particular,

(7.20) $P_x\bigl(A_1(t_0)^c\bigr) \le t_0 e^{-c_2 h(2)} = (h(2))^{12} e^{-c_2 h(2)}$.

As pointed out after (7.14), $Y^{(\epsilon,\kappa)}(T) = h(1)$, with
$T \le t_0$, can only occur under this event.

We have so far not employed (7.16). By (7.16),

$P_x\bigl(T > t_0;\, A_1(t_0)\bigr) \le 2e^{-c_2 h(2)}$.

Together with (7.20), this implies that the event in (7.6) occurs with
probability at most

(7.21) $2(h(2))^{12} e^{-c_2 h(2)} \le 1/h(2)$,

with the inequality holding because of the assumption on $c_1$ given above and
$h(2) \ge c_1$. This implies (7.6).
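
The role of the assumption on c_1 in (7.21) can be checked numerically. The
Python sketch below works with logarithms to avoid overflow. The proof does
not give a numerical value for c_2, so the value c_2 = 0.1 used here is purely
hypothetical, as are the function names.

```python
import math


def find_c1(c2, start=101):
    """Return an integer c1 >= start with x**14 <= exp(c2 * x) for all x >= c1.

    c2 * x - 14 * log(x) is increasing for x > 14 / c2, so beyond that point
    it suffices to find the first integer at which it is nonnegative.
    """
    x = max(start, math.ceil(14.0 / c2))
    while 14.0 * math.log(x) > c2 * x:
        x += 1
    return x


def bound_7_21_holds(h2, c2):
    """Check 2 * h2**12 * exp(-c2 * h2) <= 1 / h2, written in log form."""
    return math.log(2.0) + 13.0 * math.log(h2) <= c2 * h2


c2 = 0.1  # hypothetical value; the paper only asserts that some c2 > 0 exists
c1 = find_c1(c2)
ok_at_c1 = bound_7_21_holds(c1, c2)
```

For this hypothetical c_2, the inequality in (7.21) fails at h(2) = 101 but
holds for every h(2) at or above the computed c_1, which illustrates why c_1
must be chosen large relative to 1/c_2.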

The rest of the proof of (7.6) is spent justifying (7.16) and (7.17). We 
begin by justifying (7.17), which requires most of the work; (7.16) will then 
follow quickly. 

The exponential bounds in (7.17) follow from elementary large deviation 
bounds, involving sums of i.i.d. random variables, that can be obtained by 
using Markov's Inequality or by comparison with a birth-death process. We 
assume here that the reader is familiar with such bounds. Since the bounds
for some of the previous indices $\ell$ are used to derive (7.17) for larger
$\ell$, the coefficient $c_{2,\ell}$ in the exponent of the inequality
corresponding to (7.17) for a specific $\ell$ may depend on $\ell$; after
computing this bound for each $\ell$, one can then set
$c_2 = \min_\ell c_{2,\ell}$ to obtain (7.17).

The inequality in (7.17) for $A_2(n)^c$ follows from elementary large
deviations bounds and the JSQ rule (with handicap $\kappa$). To see this, note
that there is initially only one job at the designated queue and (a) The
number of moderate and large jobs entering the network will be of smaller
order of magnitude than $1/(1000\epsilon)$ off of a set of probability
$e^{-h(2)}$ (since $1/(1000\epsilon) \ge (h(3))^5/1000 \ge h(2)$ and
$t_0 = (h(3))^4 \le 10^{-6}(h(3))^5 \le 10^{-6}/\epsilon$). (b) Work from
quick jobs enters the network at rate at most 1/100 (since
$\gamma_0 \le 1/200$), which is less than the rate 1 of service at the
designated queue. Comparison with the corresponding birth-death process
therefore implies that the net number of quick jobs entering and leaving the
designated queue up to any time before $t_0$ is of smaller order than
$1/(1000\epsilon)$, off of a set of probability $e^{-h(2)}$ (since
$t_0 e^{-1/(1000\epsilon)} \le (h(3))^4 \exp\{-(h(3))^4\} \le e^{-h(2)}$).

The inequality for $A_3(n)^c$ follows immediately from the inequality for
$A_2(n)^c$, since it is assumed that the handicap $\kappa$ is at most
$K^{(\epsilon)} \le 1/(2000\epsilon)$ and so
$1/(1000\epsilon) + K^{(\epsilon)} + 1 \le 1/(500\epsilon)$.

In order to show the inequality (7.17) for $A_4(n)^c$, note that when the
other queue is empty but the designated queue is not, the rate at which jobs
with service time 1 arrive at the other queue is $2(1 - \eta) \ge 9/5$. When
these jobs are being served, the other queue is not empty. Under $A_1(n)$, the
designated queue is never empty on $[0, n]$. Noting that
$1 - \frac{9}{5} \cdot \frac{5}{14} = \frac{5}{14}$, the frequency over
$[0, n]$, for which the other queue is empty, is therefore typically at most
5/14. The inequality in (7.17) follows from an elementary large deviations
bound.

We next show (7.17) for $A_5(n)^c$. On $A_1(n)$, the other queue remains the
same over $(0, n]$. Over this interval, the set of times having a job at the
other queue, with residual service time strictly greater than $\delta$, is the
union of disjoint intervals, each with length at least 2/3, except for the
first interval (since $h(1) = 1$ and $\delta \le 1/3$). The number of such
intervals is at most $\frac{3}{2}n + 2 \le 2n$. Note that no quick jobs are
served at the other queue over such intervals. Also, at most $n$ moderate or
large jobs entering the other queue over $(0, n]$ can complete their service
over $(0, n]$. It follows that, on $A_1(n) \cap A_3(n)$, there are at most
$1/(500\epsilon)$ jobs arriving at the other queue over any such interval, and
hence at most $n/(250\epsilon) + n \le n/(200\epsilon)$ jobs arriving at the
other queue over such intervals. The inequality (7.17) for $A_5(n)^c$
therefore follows from that for $A_3(n)^c$.

Ordering these at most n/(200e) jobs in the order in which they arrive at 
the other queue, the service time of a job is independent of the service time 
of previous such jobs (although the total number of arriving jobs will not 
be independent). Since the probability a job is either moderate or large is 
less than $\epsilon$, the probability of there being at least $n/100$ moderate and large
jobs among $n/(200\epsilon)$ jobs, $n \ge h(2)/2$, is exponentially small in $h(2)$. The
inequality (7.17) for $A_6(n)^c$ therefore follows from that for $A_5(n)^c$.

For $A_7(n)^c$, note that there will typically be approximately $2n/\epsilon$ arrivals
in the network by time $n > \frac{1}{2}h(2)$ and so, by an elementary large deviations
argument, there will be at least $3n/\epsilon$ arrivals with an exponentially small
probability in $h(2)$. At time $T_0$, there are at most $1/(500\epsilon)$ jobs at the other
queue, and so there will be a total of at most $4n/\epsilon$ jobs at the other queue
after time $T_0$. The amount of time spent serving a job after it has residual
service time $\delta$ is of course $\delta$, and so the time spent serving all of these jobs
is at most $4(\delta/\epsilon)n < \frac{1}{50}n$, which demonstrates (7.17) in this case as well.

We still need to show the inequality for $A_8(n)^c$. For this, note that by
(7.17), with $\ell = 4$ and $\ell = 7$,

(7.22) $P_x\bigl(A_1(n) \cap (A_4(n)^c \cup A_7(n)^c)\bigr) \le 2e^{-c h(2)}$,

and denote by $H \subset [T_0, n]$ the random set where there is at least one job
at the other queue with residual service time strictly greater than $\delta$. Then
(7.22) gives an upper bound on the probability that $A_1(n)$ occurs and the
measure of $H$ is less than

    $n - T_0 - \left(\frac{5}{13}n + \frac{1}{50}n\right) \ge \frac{7}{12}n - T_0$.

Setting $H' = (0, T_0] \cup H$, this implies $|H'| \ge \frac{7}{12}n$ off of the exceptional event.
Since the service time of an arriving job is independent of the service times 
of previous jobs and $\eta < 1/100$, a large deviations bound implies that at
least |n moderate and large jobs enter the network on H', off of an event 
of exponentially small probability in h(2). 

On the other hand, on $A_6(n)$, at most $n/100$ moderate and large jobs
arrive at the other queue during $H$. Also, on $A_2(n)$, no jobs can arrive at
the other queue before time $T_0$, since the handicap $\kappa < 1/(2000\epsilon)$. So, on
$A_2(n) \cap A_6(n)$, at most $n/100$ moderate and large jobs arrive at the other
queue during $H'$. Together with the last paragraph and (7.17), for $\ell = 2$ and
$\ell = 6$, this implies that, off of an event of exponentially small probability
in $h(2)$, when $A_1(n)$ occurs, at least |n − > n moderate and large jobs
arrive at the designated queue over $H'$, and hence over $(0, n]$. This gives the
inequality (7.17) for $A_8(n)^c$.

We now demonstrate (7.16). Setting $n = t_0$ and $\ell = 8$, we first note that,
by (7.17),

(7.23) $P_x\bigl(A_1(t_0) \cap A_8(t_0)^c\bigr) \le e^{-c h(2)}$.

On $A_8(t_0) \cap A_1(t_0)$, at least $t_0 = (h(3))^4$ moderate and large jobs arrive at
the designated queue over $(0, t_0]$. The probability such a job has service time
at least $h(3)$ is greater than $(h(3))^{-3}$. So, when $(h(3))^4$ such jobs occur, the
probability at least one of them has service time at least $h(3)$ is greater than
$1 - 2e^{-h(3)}$. The bound (7.16) follows from this and (7.23). This completes
the proof of the proposition. □
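
The last estimate is an instance of the elementary bound $1 - x \le e^{-x}$. A short derivation is sketched below, writing $h = h(3)$ for brevity (a shorthand used only here) and treating the service times of the $h^4$ jobs as independent, as in the proof.

```latex
\[
  P\bigl(\text{none of the } h^{4} \text{ jobs has service time} \ge h\bigr)
  \;\le\; \bigl(1 - h^{-3}\bigr)^{h^{4}}
  \;\le\; \exp\bigl(-h^{-3} \cdot h^{4}\bigr)
  \;=\; e^{-h},
\]
\[
  \text{so}\qquad
  P\bigl(\text{at least one job has service time} \ge h\bigr)
  \;\ge\; 1 - e^{-h} \;>\; 1 - 2e^{-h}.
\]
```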

We now prove Proposition 7.2. The argument involves repeated couplings. 

Proof of Proposition 7.2. We apply induction to demonstrate the
proposition. The case where $i = 2$ was demonstrated in Proposition 7.1, so
we still need to show that (7.7) holds for $i + 1$ given that it holds for $i$. The
argument for this involves repeatedly coupling the process $X^{(\epsilon,\kappa)}(\cdot)$, with
$Y^{(\epsilon,\kappa)}(0) = h(i + 1)$, to one of two processes $X^{(\epsilon,\kappa)}(\cdot)$ and $X^{(\epsilon,\kappa+1)}(\cdot)$, with
$Y^{(\epsilon,\kappa)}(0) = Y^{(\epsilon,\kappa+1)}(0) = h(i)$. In our couplings, we refer to the networks
corresponding to $X^{(\epsilon,\kappa)}(\cdot)$, with $Y^{(\epsilon,\kappa)}(0) = h(i + 1)$, on the one hand, and to
the other two processes, on the other hand, as the first and second networks.
As in the proof of Proposition 7.1, we may assume that, at time 0, the
designated queue of the first network has only a single job, consisting of a
designated job at $h(i + 1)$.

We compare the first network, with index $(\epsilon, \kappa)$ for given $\kappa \le K^\epsilon - i$, to
the second network with the same index and having the exact same number
of jobs with the exact same residual service times and designated job, except
that the designated job of the second network has residual service time $h(i)$
instead of $h(i + 1)$. (Recall that the designated job is not required to have the
largest residual service time in the network.) One can couple the processes so
that their evolution is exactly the same, with respect to service of individual
jobs and arrivals, until either (a) the residual service time of the designated
job of the second network reaches $h(i - 1)$, (b) an arrival at the designated
queue has service time $h(i)$, or (c) an arrival at the designated queue has
service time at least $h(i + 1)$. In the last two cases, the arriving job becomes
the designated job in the second network and, in the last case, it becomes
the designated job in the first network.

By the induction assumption, the event in (a) occurs with probability at
most $1/h(i)$. We consider such an event a minor failure. At the time $\sigma$ at
which it occurs, the designated job of the first network will have residual
service time

    $h(i + 1) - h(i) + h(i - 1) > h(i + 1) - h(i)$.

When the event in (c) occurs, we refer to it as a minor success.

At the time $\sigma$ the event in (b) occurs, we change the coupling. We now
couple the first network to the second network with index $(\epsilon, \kappa + 1)$, and
having the same number of jobs with the same residual service times and
designated queue as the first network, after ignoring the designated job of
the first network. The second network therefore has one less job than the
first network at time $\sigma$. The job that has arrived at time $\sigma$ has service time
$h(i)$ and so will be the designated job for the second network. It has the
greatest residual service time for the designated queue in the first network,
except for the designated job, and will be served last except for that job; it
will be referred to as the associate designated job of the first network.

Since the first and second networks differ only by the presence of the large
designated job in the former, and since the handicap of the second network
is one greater than that for the first network, the networks can be coupled so
that their evolution is the same until the time $\sigma_1$ at which either the event
(a) or the event (c) in the second to last paragraph occurs. Under (a), a
minor failure occurs whereas, under (c), a minor success occurs. The minor
failure occurs with probability at most $1/h(i)$, at which time the associate
designated job of the first network and the designated job of the second
network each have residual service time $h(i - 1)$ and the designated job of
the first network still has residual service time at least $h(i + 1) - h(i)$.

Combining the preceding two couplings, it follows from the above reasoning
that the probability a minor failure occurs before a minor success is at
most $2/h(i)$. One can repeat this reasoning up to $n(i)$ times, where

    $n(i) \stackrel{\mathrm{def}}{=} \lfloor h(i + 1)/h(i) \rfloor - 2$,

so that either (d) $n(i)$ minor failures or (e) a minor success has occurred.
At this time $\sigma_2$, the residual service time of the designated job of the first
network is still greater than $h(i)$. The probability of the event in (d) is at
most $(2/h(i))^{n(i)}$. If the residual service time of this job decreases to $h(i)$
before (e) occurs, we say a failure occurs.
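
The choice of $n(i)$ can be read off from a short computation. The sketch below assumes, as the preceding paragraphs indicate, that each round ending in a minor failure lowers the residual service time of the designated job of the first network by at most $h(i)$, and that the per-round bound $2/h(i)$ can be applied conditionally at each round.

```latex
% Residual service time of the designated job after n(i) rounds,
% each costing at most h(i), with n(i) = \lfloor h(i+1)/h(i) \rfloor - 2:
\[
  h(i+1) - n(i)\,h(i)
  \;\ge\; h(i+1) - \Bigl(\frac{h(i+1)}{h(i)} - 2\Bigr) h(i)
  \;=\; 2h(i) \;>\; h(i).
\]
% Probability that all n(i) rounds end in a minor failure:
\[
  P\bigl(n(i) \text{ minor failures occur before a minor success}\bigr)
  \;\le\; \Bigl(\frac{2}{h(i)}\Bigr)^{n(i)} .
\]
```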

When a minor success occurs, the arriving job has service time at least 
h(i + 1) and becomes the designated job. If the service time of the arriving 
job is at least h(i + 2), we call the minor success a success and stop the 
procedure. If a minor success that is not a success occurs, one can repeat 
the above comparisons, starting as before with a single job at the designated 
queue. Letting Q(i) denote the number of minor successes occurring before 
a failure and B denote the event that a success occurs before a failure, it 
follows from the previous paragraph and the definition of $h(\cdot)$ that

(7.24) $P_x\bigl(Q(i) \le e^{2\sqrt{h(i+1)}};\, B^c\bigr) \le e^{2\sqrt{h(i+1)}} \left(\frac{2}{h(i)}\right)^{n(i)} \le \left(\frac{2}{h(i)}\right)^{\frac{1}{2} h(i+1)/h(i)} \le \frac{1}{2h(i+1)}$.

On the other hand, the probability that a minor success will actually be 
a success is at least 

(7.25) ^(M*))7(fc(* + !)) m > l/(Hi + !)) m > e-V^ 1 ). 

Together with (7.24), (7.25) implies that the probability of a failure occurring 
before a success is at most 

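    $\bigl(1 - e^{-\sqrt{h(i+1)}}\bigr)^{e^{2\sqrt{h(i+1)}}} + \frac{1}{2h(i+1)} \le \exp\bigl(-e^{\sqrt{h(i+1)}}\bigr) + \frac{1}{2h(i+1)} \le \frac{1}{h(i+1)}$.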

Since success and failure correspond to the two events in the statement of 
Proposition 7.2, this implies (7.7) of the proposition. □ 

Acknowledgment. The author thanks an anonymous referee for a care- 